### Introduction

[…] $\theta_1$ and $\theta_2$, respectively) that arise from a single ancestral population (size $\theta_a$) at time $T_S$ in the past, while the two populations may exchange migrants at rates $m_1$ and $m_2$ (see [1-3] for notation). Both population sizes and migration rates are assumed to be constant over time [4].
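To make the parameter set concrete, the six quantities of a two-population IM model can be collected in one small container. This is a minimal sketch: the class name `IMParameters`, its field names, and the validation rules are illustrative assumptions, not part of any published software.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class IMParameters:
    """Six parameters of a two-population IM model (names are illustrative)."""
    theta1: float   # size of population 1
    theta2: float   # size of population 2
    theta_a: float  # size of the ancestral population
    m1: float       # migration rate m_1
    m2: float       # migration rate m_2
    t_split: float  # splitting time T_S

    def __post_init__(self):
        # Assumed constraints: sizes are positive, rates and time are non-negative.
        for name in ("theta1", "theta2", "theta_a"):
            if getattr(self, name) <= 0:
                raise ValueError(f"{name} must be positive")
        for name in ("m1", "m2", "t_split"):
            if getattr(self, name) < 0:
                raise ValueError(f"{name} must be non-negative")

    def is_pure_isolation(self) -> bool:
        """True when both migration rates are zero (isolation without migration)."""
        return self.m1 == 0 and self.m2 == 0
```

Setting both migration rates to zero gives the pure isolation case, which is the special case that a multispecies coalescent analysis without migration assumes.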

[…] *no migrations*) and even phylogeny have been addressed by using a multispecies coalescent framework [5-11]. However, ignoring migration can bias estimates of the splitting times of populations or species and may lead to an incorrect phylogenetic tree estimate [12-16]. Efforts to distinguish between isolation and migration began about 20 years ago, and many methods have employed Markov chain Monte Carlo (MCMC) simulation to infer an IM model [2-4,17-21]. However, most of these methods are hampered by the long computational time of MCMC simulation, which typically limits the amount of data that can be analyzed [1]. In addition, the joint estimation of both phylogeny and an IM model is known to be extremely difficult [16].

### DNA Alignments

### Standard Model Structure

[…] $\theta_1, \theta_2, \theta_a, m_1, m_2, T_S$). The $i$th locus $D_i$ out of $L$ loci is an observation, and the genealogy $G_i$ of $D_i$ is a latent variable that typically cannot be observed (Fig. 2). Fig. 2 depicts the structure of the standard models. The standard models address two levels of uncertainty: the distribution of DNA sequences given a genealogy and that of a genealogy given an IM model [11,12,25]. We typically assume that there is no recombination within a locus and free recombination between loci. In other words, the $i$th locus $D_i$ out of $L$ loci has its own genealogy $G_i$, and loci are independent. Given the genealogy, the genetic data and the demography $\psi = (\theta_1, \theta_2, \theta_a, m_1, m_2, T_S)$ are assumed to be conditionally independent. As the distribution of DNA sequences […]
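The two assumptions stated above (independent loci, and data conditionally independent of the demography given the genealogy) imply a standard factorization of the likelihood. The display below is a sketch derived from those assumptions rather than a reproduction of the article's own numbered equations; writing densities as $p(\cdot)$ is an assumption of this sketch.

```latex
p(D_1, \ldots, D_L \mid \psi)
  = \prod_{i=1}^{L} \int p(D_i \mid G_i)\, p(G_i \mid \psi)\, dG_i
```

Here the integral stands for a sum over genealogy topologies together with an integral over event times. The product over loci comes from the independence of loci, and the two factors inside the integral correspond to the two levels of uncertainty: sequences given a genealogy, and a genealogy given the IM model.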

### MCMC Simulation and the Mixing Problem

[…] $(t-1)$th iteration for the genealogy of one locus and all demographic parameters $\psi$, including splitting time […] $t$th iteration, we propose a new splitting time […] $q$ and either accept the new value […]$^{*}$ and $\psi^{t-1}$ includes […]$^{t}$, …, $\psi^{n} \sim$ […]
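The propose-then-accept-or-reject step described above can be illustrated with a generic random-walk Metropolis sampler. This is a minimal sketch on a toy one-dimensional target, not the update used by any IM program: the function names, the Gaussian proposal, and the Gamma-shaped toy density standing in for a posterior of a positive time parameter are all assumptions made for illustration.

```python
import math
import random


def metropolis_hastings(log_target, x0, n_iter, step=0.5, seed=0):
    """Random-walk Metropolis sampler for a scalar parameter (toy example)."""
    rng = random.Random(seed)
    x, log_p = x0, log_target(x0)
    samples = []
    for _ in range(n_iter):
        # Propose a new value from a symmetric proposal q(. | x).
        x_new = x + rng.gauss(0.0, step)
        log_p_new = log_target(x_new)
        # Symmetric q cancels, so accept with probability min(1, p_new / p_old).
        accept_prob = math.exp(min(0.0, log_p_new - log_p))
        if rng.random() < accept_prob:
            x, log_p = x_new, log_p_new
        samples.append(x)  # on rejection the previous value is repeated
    return samples


def toy_log_density(t):
    """Unnormalized log density of Gamma(shape=3, rate=2); no mass at t <= 0."""
    if t <= 0:
        return -math.inf
    return 2.0 * math.log(t) - 2.0 * t


draws = metropolis_hastings(toy_log_density, x0=1.0, n_iter=20000)
kept = draws[2000:]  # discard an initial burn-in portion
posterior_mean = sum(kept) / len(kept)
```

A proposal with zero target density (here, any `t <= 0`) is always rejected and the chain repeats its current value. When a large share of proposals is rejected in this way, successive samples are highly correlated, which is the kind of behavior the term "mixing problem" refers to.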

### Inference Methods

### IMa3

[…] *a posteriori* (MAP) estimate with highest posterior density intervals of the splitting time based on the marginal posterior density […] $T_S, G_1, \ldots, G_L$). The approximated densities […]
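As a generic illustration of the two summaries named above, a point estimate and a highest posterior density (HPD) interval can be computed from a list of posterior draws. This is a minimal sketch and not IMa3's procedure: the function names are hypothetical, the HPD routine assumes a unimodal posterior, and the histogram-based mode is a crude stand-in for a properly approximated marginal density.

```python
import math


def hpd_interval(samples, mass=0.95):
    """Narrowest interval containing `mass` of the samples (unimodal case)."""
    xs = sorted(samples)
    n = len(xs)
    k = max(1, math.ceil(mass * n))  # number of points inside the interval
    # Slide a window of k consecutive order statistics and keep the shortest.
    best = min(range(n - k + 1), key=lambda i: xs[i + k - 1] - xs[i])
    return xs[best], xs[best + k - 1]


def map_estimate(samples, n_bins=50):
    """Histogram-based mode: midpoint of the most populated bin."""
    lo, hi = min(samples), max(samples)
    if lo == hi:
        return lo
    width = (hi - lo) / n_bins
    counts = [0] * n_bins
    for x in samples:
        # The maximum value falls on the right edge, so clamp it into the last bin.
        counts[min(int((x - lo) / width), n_bins - 1)] += 1
    top = counts.index(max(counts))
    return lo + (top + 0.5) * width
```

Unlike an equal-tailed interval, the HPD interval is the shortest interval with the requested coverage, so it shifts toward the mode when the posterior is skewed, as posteriors of time parameters often are.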

[…] “*hidden migrations*,” that occurred earlier than the splitting time, so that a newly proposed splitting time or phylogeny is not instantly rejected but evaluated with a non-zero acceptance probability. For example, if a newly proposed splitting time is younger than existing migrations (Fig. 4B), the migration paths older than the splitting time are considered hidden migration paths ($M_H$), and the genealogy is the one without hidden migrations and compatible with the new splitting time. In other words, the current genealogy, given the new splitting time, is a so-called “*hidden genealogy*” $G_H = (G, M_H)$. Given phylogeny $\tau$ and demographic parameters $\psi$, the distribution of the hidden genealogy is partitioned into those of hidden migrations and the genealogy without hidden migrations: […]

[…] $\tau^1, \ldots, \tau^n$ from the MCMC samples approximately follow the marginal posterior […]
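The partition of the hidden-genealogy distribution into a hidden-migration part and a part for the genealogy without hidden migrations can be sketched with the chain rule of probability. The display below is only this generic factorization, written in the notation $G_H = (G, M_H)$ used above; it is an assumption of this sketch and does not specify the exact form of either factor as implemented in IMa3.

```latex
p(G_H \mid \tau, \psi)
  = p(G, M_H \mid \tau, \psi)
  = p(M_H \mid G, \tau, \psi)\, p(G \mid \tau, \psi)
```

The second factor is the usual distribution of a genealogy compatible with the splitting time, while the first factor assigns a probability to the migration paths older than the splitting time, which is what keeps a proposal that creates such paths from having zero probability.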

### MIST

[…] *coalescent trees* $\lambda$) via an MCMC simulation. Note that no information about a demographic model is necessary in the first step, which alleviates the mixing problem. Second, the joint posterior density […] $G_i = (\lambda_i, M_i)$ and $M_i$ is the set of all migration information. This rewrites Eq. (2) as follows: […]

### AIM

[…] $L_1$ and $L_2$ are lineages of $\lambda$ at time $t$. AIM implements this independence approximation rather than the exact density […] $\alpha_{A,B}$ is a scaler that is estimated between every pair of coexisting populations/species, $\delta_{AB}$ is the time to the most recent common ancestor of the coexisting populations A and B, and $m_{tot}$ is an estimated migration rate that allows for a prior distribution on the expected magnitude of the migration rate. This parameterization allows for smaller migration rates between more distant populations. Furthermore, each scaler $\alpha_{A,B} \sim \mathrm{Exp}(1)$, and all scalers are assumed to be independent. AIM is able to use the priors previously implemented for species tree estimation in starBEAST2 [44].
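The defining formula for this parameterization is not present in the text above, so the sketch below rests on an assumption: that the pairwise migration rate takes the form $\alpha_{A,B}\, m_{tot} / \delta_{AB}$, which is consistent with the statement that more distant populations receive smaller rates. The function names are hypothetical and this is not AIM's code.

```python
import random


def migration_rate(alpha_ab, m_tot, delta_ab):
    """Assumed form: rate = alpha_AB * m_tot / delta_AB (not confirmed by the text)."""
    if delta_ab <= 0:
        raise ValueError("delta_ab must be positive")
    return alpha_ab * m_tot / delta_ab


def sample_scalers(pairs, seed=0):
    """Draw one independent Exp(1) scaler per population pair."""
    rng = random.Random(seed)
    return {pair: rng.expovariate(1.0) for pair in pairs}


scalers = sample_scalers([("A", "B"), ("A", "C")])
# Same scaler and overall rate, but a tenfold older common ancestor.
near = migration_rate(scalers[("A", "B")], m_tot=0.1, delta_ab=1.0)
far = migration_rate(scalers[("A", "B")], m_tot=0.1, delta_ab=10.0)
```

Under the assumed form, a tenfold larger $\delta_{AB}$ gives a tenfold smaller rate for the same scaler, while the Exp(1) prior lets each pair deviate from that baseline independently.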