Add FAQ section to improve user guidance #608
base: main
Changes from 1 commit
…ing.jl models and parallelism usage
@@ -14,52 +14,118 @@ x ~ filldist(Normal(), 2)

You cannot directly condition on `x[2]` using `condition(model, @varname(x[2]) => 1.0)` because `x[2]` never appears on the LHS of a `~` statement. Only `x` as a whole appears there.

However, there is an important exception: when you use the broadcasting operator `.~` with a univariate distribution, each element is treated as a separate draw from that distribution, so you can condition on individual elements:

```julia
using Turing

@model function f1()
    x = Vector{Float64}(undef, 3)
    x .~ Normal()  # Each element is a separate draw
end

m1 = f1() | (@varname(x[1]) => 1.0)
sample(m1, NUTS(), 100)  # This works!
```

In contrast, you cannot condition on parts of a multivariate distribution because it represents a single distribution over the entire vector:

```julia
using Turing
using LinearAlgebra: I

@model function f2()
    x = Vector{Float64}(undef, 3)
    x ~ MvNormal(zeros(3), I)  # Single multivariate distribution
end

m2 = f2() | (@varname(x[1]) => 1.0)
sample(m2, NUTS(), 100)  # This doesn't work!
```

The key insight is that `filldist` creates a single distribution (not N independent distributions), which is why you cannot condition on individual elements. The distinction is not just about what appears on the LHS of `~`, but about whether you're dealing with separate distributions (`.~` with a univariate distribution) or a single distribution over multiple values (`~` with a multivariate distribution or `filldist`).

To understand more about how Turing determines whether a variable is treated as random or observed, see:
- [Compiler Design Overview](../developers/compiler/design-overview/) - explains the heuristics Turing uses
- [Core Functionality](../core-functionality/) - basic explanation of the `~` notation and conditioning

## How do I implement a sampler for a Turing.jl model?

We have comprehensive guides on implementing custom samplers:
- [Implementing Samplers Tutorial](../developers/inference/implementing-samplers/) - step-by-step guide on implementing samplers in the AbstractMCMC framework
- [AbstractMCMC-Turing Interface](../developers/inference/abstractmcmc-turing/) - how to integrate your sampler with Turing
- [AbstractMCMC Interface](../developers/inference/abstractmcmc-interface/) - the underlying interface documentation

## Can I use parallelism / threads in my model?

Yes! Turing.jl supports both multithreaded and distributed sampling. See the [Core Functionality guide](../core-functionality/#sampling-multiple-chains) for detailed examples showing:
- Multithreaded sampling using `MCMCThreads()`
- Distributed sampling using `MCMCDistributed()`
Yes, but with important caveats! There are two types of parallelism to consider:

### 1. Parallel Sampling (Multiple Chains)
Turing.jl fully supports sampling multiple chains in parallel:
- **Multithreaded sampling**: Use `MCMCThreads()` to run one chain per thread
- **Distributed sampling**: Use `MCMCDistributed()` for distributed computing

See the [Core Functionality guide](../core-functionality/#sampling-multiple-chains) for examples.
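
As a minimal sketch of multi-chain sampling, the snippet below runs four chains with `MCMCThreads()`. The `coin` model and its data are hypothetical and exist only to give `sample` something to work on; the chain and draw counts are arbitrary.

```julia
using Turing

# Hypothetical toy model, used only to illustrate the sampling call.
@model function coin(y)
    p ~ Beta(1, 1)
    for i in eachindex(y)
        y[i] ~ Bernoulli(p)
    end
end

model = coin([1, 0, 1, 1, 0, 1])

# Four chains of 500 draws each, distributed over the available threads.
# Start Julia with several threads (e.g. `julia --threads 4`) to see a speed-up.
chains = sample(model, NUTS(), MCMCThreads(), 500, 4; progress=false)

# Distributed variant (assumes worker processes were added with `addprocs`
# and that Turing and the model are loaded on every worker via `@everywhere`):
# chains = sample(model, NUTS(), MCMCDistributed(), 500, 4)
```

If Julia was started with a single thread, the `MCMCThreads()` call should still run, just without any parallel speed-up.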

### 2. Threading Within Models
Using threads inside your model (e.g., `Threads.@threads`) requires more care:

```julia
using Turing

@model function f(x)
    Threads.@threads for i in eachindex(x)
        # Because `x` is a model argument, this line is an assume statement
        # only where x[i] is `missing`; where x[i] holds data it is an observe statement.
        x[i] ~ Normal()  # UNSAFE: Assume statements in threads can crash!
    end
end
```

**Important limitations:**
- **Observe statements**: Generally safe to use in threaded loops
- **Assume statements** (sampling statements): Often crash unpredictably or produce incorrect results
- **AD backend compatibility**: Many AD backends don't support threading. Check the [multithreaded column in ADTests](https://turinglang.org/ADTests/) for compatibility

For safe parallelism within models, consider vectorized operations instead of explicit threading.
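
One way to read "vectorized operations" is to replace a loop of per-element observe statements with a single multivariate statement. The sketch below does that for a hypothetical Gaussian model; the model name, priors, and data are assumptions chosen for illustration, not part of the original FAQ.

```julia
using Turing
using LinearAlgebra: I

# Hypothetical model: all observations share one mean and one scale.
@model function gauss(y)
    mu ~ Normal(0, 1)
    sigma ~ truncated(Normal(0, 1); lower=0)
    # A single multivariate observe statement replaces a (threaded) loop
    # over `y[i] ~ Normal(mu, sigma)`.
    y ~ MvNormal(fill(mu, length(y)), sigma^2 * I)
end

chain = sample(gauss(randn(100)), NUTS(), 200; progress=false)
```

The trade-off is the one described earlier in this FAQ: `y` is now a single multivariate variable, so individual elements of `y` can no longer be conditioned on separately.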

## How do I check the type stability of my Turing model?

Type stability is crucial for performance. Check out:
- [Performance Tips](../usage/performance-tips/) - includes specific advice on type stability
- [Automatic Differentiation](../usage/automatic-differentiation/) - contains benchmarking utilities using `DynamicPPL.TestUtils.AD`
- [Performance Tips]({{< meta usage-performance-tips >}}) - includes specific advice on type stability
- Use `DynamicPPL.DebugUtils.model_warntype` to check type stability of your model
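
A minimal sketch of the `model_warntype` check follows. It assumes an installed DynamicPPL version that ships the `DebugUtils` submodule, and it reaches DynamicPPL through `Turing.DynamicPPL` so that no extra package needs to be added; the `demo` model is hypothetical.

```julia
using Turing

# Hypothetical model used only as something to inspect.
@model function demo(y)
    mu ~ Normal(0, 1)
    y ~ Normal(mu, 1)
end

model = demo(0.5)

# Prints `@code_warntype`-style output for evaluating the model.
# Look for `Any` or `Union` annotations as hints of type instability.
Turing.DynamicPPL.DebugUtils.model_warntype(model)
```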

## How do I debug my Turing model?

For debugging both statistical and syntactical issues:
- [Troubleshooting Guide](../usage/troubleshooting/) - common errors and their solutions
- [Troubleshooting Guide]({{< meta usage-troubleshooting >}}) - common errors and their solutions
- For more advanced debugging, DynamicPPL provides `DynamicPPL.DebugUtils` for inspecting model internals

## What are the main differences between Turing, BUGS, and Stan syntax?
## What are the main differences between Turing and Stan syntax?

Key syntactic differences include:

- **Parameter blocks**: Stan requires explicit `data`, `parameters`, `transformed parameters`, and `model` blocks. In Turing, everything is defined within the `@model` macro
- **Variable declarations**: Stan requires upfront type declarations in parameter blocks. Turing infers types from the sampling statements
- **Transformed data**: Stan has a `transformed data` block for preprocessing. In Turing, data transformations should be done before defining the model
- **Generated quantities**: Stan has a `generated quantities` block. In Turing, use the approach described in [Tracking Extra Quantities]({{< meta usage-tracking-extra-quantities >}})
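
To make the generated-quantities point concrete, here is a hedged sketch that uses the model's `return` value as the derived quantity. The `scaled` model and the `mu_doubled` name are hypothetical; the linked Tracking Extra Quantities page remains the authoritative description of the recommended approach.

```julia
using Turing

# Hypothetical model: the `return` value plays the role of a
# Stan `generated quantities` block.
@model function scaled(y)
    mu ~ Normal(0, 1)
    y ~ Normal(mu, 1)
    return (; mu_doubled = 2 * mu)
end

model = scaled(0.3)

# Calling the model draws the parameters from the prior and
# gives back the model's return value.
derived = model()
```

To evaluate the same return value for every draw of a sampled chain, recent Turing versions are assumed to provide `returned(model, chain)` (older versions used `generated_quantities`); check which one your installed version offers.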

Example comparison:
```stan
// Stan
data {
  int<lower=0> N;
  vector[N] y;
}
parameters {
  real mu;
  real<lower=0> sigma;
}
model {
  y ~ normal(mu, sigma);
}
```

Review comment: So I'm not a super duper expert on Stan, but I think that these parameters don't have priors assigned and thus have completely flat priors (i.e. the prior probability is always 1 for any value of mu and any value of sigma > 0). That would make this not equivalent to the Turing model, which has non-flat priors. I think you would need to specify

Reply: updated this in the next commit - just pushed

While there are many syntactic differences, key advantages of Turing include:
- **Julia ecosystem**: Full access to Julia's profiling and debugging tools
- **Parallel computing**: Much easier to use distributed and parallel computing inside models
- **Flexibility**: Can use arbitrary Julia code within models
- **Extensibility**: Easy to implement custom distributions and samplers
```julia
# Turing
using Turing

@model function my_model(y)
    mu ~ Normal(0, 1)
    sigma ~ truncated(Normal(0, 1), 0, Inf)

    # As written, this expects a scalar `y`; for a vector of observations
    # (as in the Stan model), one option is `y .~ Normal(mu, sigma)`.
    y ~ Normal(mu, sigma)
end
```

## Which automatic differentiation backend should I use?

The choice of AD backend can significantly impact performance. See:
- [Automatic Differentiation Guide](../usage/automatic-differentiation/) - comprehensive comparison of ForwardDiff, Mooncake, ReverseDiff, and other backends
- [Performance Tips](../usage/performance-tips/#choose-your-ad-backend) - quick guide on choosing backends
- [Automatic Differentiation Guide]({{< meta usage-automatic-differentiation >}}) - comprehensive comparison of ForwardDiff, Mooncake, ReverseDiff, and other backends
- [Performance Tips]({{< meta usage-performance-tips >}}#choose-your-ad-backend) - quick guide on choosing backends
- [AD Backend Benchmarks](https://turinglang.org/ADTests/) - performance comparisons across various models

For more specific recommendations, check out the [DifferentiationInterface.jl tutorial](https://juliadiff.org/DifferentiationInterface.jl/DifferentiationInterfaceTest/stable/tutorial/).
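
For orientation, the sketch below shows how a backend is typically selected: through the `adtype` keyword of a gradient-based sampler. The `demo` model is hypothetical, and the ReverseDiff lines are left commented out because they assume that ReverseDiff is installed.

```julia
using Turing

# Hypothetical model used only to demonstrate backend selection.
@model function demo(y)
    mu ~ Normal(0, 1)
    y ~ Normal(mu, 1)
end

model = demo(0.5)

# Select ForwardDiff explicitly via the sampler's `adtype` keyword.
chain_fd = sample(model, NUTS(; adtype=AutoForwardDiff()), 200; progress=false)

# Switching backends only changes `adtype` (assumes ReverseDiff is installed):
# using ReverseDiff
# chain_rd = sample(model, NUTS(; adtype=AutoReverseDiff()), 200; progress=false)
```

Timing the same model under two or more backends is usually the most direct way to decide, since the best choice depends on the model's size and structure.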

## I changed one line of my model and now it's so much slower; why?

Small changes can have big performance impacts. Common culprits include:

@@ -68,4 +134,4 @@ Small changes can have big performance impacts. Common culprits include:
- Inadvertently causing AD backend incompatibilities
- Breaking assumptions that allowed compiler optimizations

See our [Performance Tips](../usage/performance-tips/) and [Troubleshooting Guide](../usage/troubleshooting/) for debugging performance regressions.
See our [Performance Tips]({{< meta usage-performance-tips >}}) and [Troubleshooting Guide]({{< meta usage-troubleshooting >}}) for debugging performance regressions.