How should HFSS be set up for an accurate DC value?

In most cases, HFSS gives accurate DC results with the default settings (see this article). Intentionally checking "solve inside" is NOT suggested: it causes inaccuracy at high frequency because of the poor mesh quality in the skin-effect region (see this article). Setting the frequency sweep to [Advanced DC Extrapolation] is NOT recommended either, because it is hard to guarantee passivity/causality.
In HFSS 3D (new in 2021R2) and HFSS 3D Layout, checking [Use Q3D to solve DC point] in the sweep settings as the default practice is suggested for an accurate DC point. In addition, checking [Enhanced low frequency accuracy] in the solution options of the setup (new in 2021R2; only lumped/circuit ports are supported so far) also helps low-frequency accuracy (see this article).

Why are the Q3D and HFSS results inconsistent?

Consistent results can be achieved (see this article or video), but two conditions have to be met first:
-- The comparison must be made at low frequency, where the structure is shorter than 1/10 wavelength, because Q3D is a quasi-static solver whereas HFSS is a full-wave solver.
-- Q3D must also solve the return path, and the reduce-matrix operation [Add return path] must be applied in post-processing to obtain the loop RL that is compared with HFSS.
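
The first condition can be estimated with a quick calculation. The following is a minimal sketch, not part of either tool: the function name, the example length, and the relative permittivity are assumptions for illustration only.

```python
import math

C0 = 299_792_458.0  # speed of light in vacuum, m/s


def max_quasi_static_freq_hz(length_m, eps_r=1.0, fraction=0.1):
    """Highest frequency at which length_m stays below fraction * wavelength.

    eps_r is the relative permittivity of the surrounding medium
    (an assumed, homogeneous value; real stack-ups are more complex).
    """
    wavelength_limit_m = length_m / fraction
    return C0 / (math.sqrt(eps_r) * wavelength_limit_m)


if __name__ == "__main__":
    # Hypothetical 10 mm structure in an FR4-like medium (eps_r = 4.4 assumed)
    f_max = max_quasi_static_freq_hz(10e-3, eps_r=4.4)
    print(f"Compare Q3D and HFSS below about {f_max / 1e9:.2f} GHz")
```

Above this frequency the structure is no longer electrically small, so differences between the quasi-static and full-wave results are expected.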

What causes a circuit simulation to fail to converge? How to deal with it?

In most cases, a circuit simulation fails to converge because the extracted S-parameter model violates passivity or causality. It is not a good idea to use the "enforce" function provided by the simulation tool, or to translate the S-parameter directly into a SPICE model. Results repaired by enforcement are not necessarily correct (the numerical repair satisfies passivity and causality, but it also alters the S-parameter). The latter approach is not a clever method either, because it turns a broadband model into a narrow-band model.
The best approach is to go back and review the settings of the original model. Usually the material settings, mesh quality, and frequency sweep settings are the main factors (see this article or video).
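
Before going back to the model, it can help to confirm that passivity really is violated. One common check is that the largest singular value of the S-matrix never exceeds 1 at any frequency. Below is a minimal pure-Python sketch for a two-port; the function names, the tolerance, and the sample values are hypothetical, and the check says nothing about causality.

```python
import math


def max_singular_value_2x2(s11, s12, s21, s22):
    """Largest singular value of a 2x2 complex S-matrix.

    Uses the closed-form eigenvalues of the Hermitian matrix A = S^H S.
    """
    a11 = abs(s11) ** 2 + abs(s21) ** 2
    a22 = abs(s12) ** 2 + abs(s22) ** 2
    a12 = s11.conjugate() * s12 + s21.conjugate() * s22
    trace = a11 + a22
    det = a11 * a22 - abs(a12) ** 2
    disc = max(trace * trace - 4.0 * det, 0.0)  # guard against round-off
    return math.sqrt((trace + math.sqrt(disc)) / 2.0)


def is_passive(samples, tol=1e-6):
    """samples: iterable of (s11, s12, s21, s22) tuples, one per frequency."""
    return all(max_singular_value_2x2(*s) <= 1.0 + tol for s in samples)


if __name__ == "__main__":
    # Hypothetical two-port data at two frequency points
    good = [(0.1 + 0j, 0.9 + 0j, 0.9 + 0j, 0.1 + 0j)]
    bad = [(0.1 + 0j, 1.05 + 0j, 1.05 + 0j, 0.1 + 0j)]
    print(is_passive(good), is_passive(bad))
```

If the check fails only in a narrow band, that band is a good place to start reviewing the sweep and mesh settings.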

Why is the antenna radiation efficiency from HFSS always higher than the measured value?

Antenna radiation efficiency is generally measured like this: 1 W is fed into the transmitting end, the energy received at the receiving end is observed (e.g. 0.9 W), and the ratio is taken (0.9/1). The radiation efficiency defined by HFSS works differently: 1 W is forced into the antenna, the energy actually transmitted is less (e.g. 0.95 W), the receiving end receives e.g. 0.9 W, and the ratio is 0.9/0.95. That is, the radiation efficiency defined by HFSS does not take into account the reflection loss at the transmitting (feeding) end.
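
The two ratios can be written out as a short calculation. This is only an illustrative sketch using the example numbers from the text; the function names are made up here and are not HFSS quantities.

```python
def radiation_efficiency(p_radiated_w, p_accepted_w):
    """HFSS-style definition: radiated power over power accepted by the antenna."""
    return p_radiated_w / p_accepted_w


def total_efficiency(p_radiated_w, p_incident_w):
    """Measurement-style definition: radiated power over power sent to the feed."""
    return p_radiated_w / p_incident_w


def mismatch_factor(p_accepted_w, p_incident_w):
    """Fraction of the incident power that is not reflected at the feed."""
    return p_accepted_w / p_incident_w


if __name__ == "__main__":
    p_incident, p_accepted, p_radiated = 1.0, 0.95, 0.9  # watts, from the text
    eta_rad = radiation_efficiency(p_radiated, p_accepted)
    eta_tot = total_efficiency(p_radiated, p_incident)
    print(f"HFSS-style: {eta_rad:.3f}, measurement-style: {eta_tot:.3f}")
    # The two differ exactly by the reflection (mismatch) loss at the feed
    print(f"ratio: {eta_tot / eta_rad:.3f}")
```

Multiplying the HFSS-style value by the mismatch factor gives the number that should be compared with the measurement.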

If the air box is not sized properly, the energy may not radiate completely out of the outer boundary. A Radiation boundary is suggested to be 1/4 wavelength away from the object, and a PML boundary 1/8 wavelength away. In addition, you can draw an inner space as a mesh seed, or set a mesh operation on the background space, to improve the mesh quality (this clearly resolves the problem of radiation efficiency > 1).
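
The suggested spacing is easy to turn into a number. The sketch below assumes the wavelength is taken in free space at the lowest frequency of interest, which the text does not state explicitly; the function name is hypothetical.

```python
C0 = 299_792_458.0  # speed of light in vacuum, m/s

# Fractions of a wavelength suggested in the text for each boundary type
PADDING_FRACTION = {"radiation": 0.25, "pml": 0.125}


def airbox_padding_mm(freq_hz, boundary="radiation"):
    """Distance from the object to the outer boundary, in millimetres.

    freq_hz is assumed to be the lowest frequency of interest (free space).
    """
    wavelength_m = C0 / freq_hz
    return PADDING_FRACTION[boundary] * wavelength_m * 1e3


if __name__ == "__main__":
    for kind in ("radiation", "pml"):
        print(kind, f"{airbox_padding_mm(2.4e9, kind):.1f} mm")
```

For a hypothetical 2.4 GHz antenna this gives roughly 31 mm for a Radiation boundary and half of that for PML.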

To speed up simulation, which should we choose, CPU or GPU? Is an SSD necessary?

-- For software, purchase the HPC option; for hardware, purchase a strong CPU (2.3 GHz at least) and plenty of DDR4 RAM (256 GB may be necessary for a 5G array antenna).
-- Choose CPU over GPU. The latter is good at floating-point operations, but it only helps the HFSS time domain/transient solver and some large antenna simulations.
-- After purchasing the CPU and DDR4, if there is budget to spare, purchase a 1 TB SSD for the C drive as well.

Why do the HFSS wave port and lumped port sometimes give different results? Which is more accurate?

The two ports have different characteristics. If the frequency is not high and the lumped port is short, it is easy to get consistent results between them. Conversely, if the frequency is high and the lumped port is long, the results will differ (even if lumped port de-embedding has been applied). That is, the key is not which technology is more accurate, but whether you know the conditions under which each one applies. (see lump port, wave port)

When looking at TDR in HFSS, versus in Designer with S-parameters extracted from HFSS, why are the results different?

HFSS and Designer use different techniques to calculate TDR. If you would like to view TDR directly in HFSS, the effective bandwidth of the model extraction should be 3~4 times the original (see this article).

For HFSS, why does "solve inside" sometimes cause different results? When do we need to set "solve inside"?

"Solve inside" applied in the high-frequency skin-effect regime (conductor thickness >> skin depth) usually cannot achieve sufficient accuracy because of poor mesh quality, even if you refine the mesh further, reduce the delta S, or add a skin depth mesh operation. For some simple structures, or for transmission lines with wave port de-embedding, the results can still match. (see this article)
Generally, "solve inside" is considered only when you want to look at the current/field inside the conductor, verify the HFSS DC accuracy, care about the Q accuracy of an inductor, or perform S.E. analysis on a shielded chassis. For the last case, a two-sided shell element is also preferred.
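
To judge whether a conductor is in the skin-effect regime (thickness >> skin depth), the skin depth can be estimated with the standard formula delta = 1 / sqrt(pi * f * mu * sigma). The sketch below is illustrative only; the copper conductivity and the 35 um thickness are assumed example values, not settings taken from HFSS.

```python
import math

MU0 = 4.0e-7 * math.pi          # permeability of free space, H/m
SIGMA_COPPER = 5.8e7            # assumed copper conductivity, S/m


def skin_depth_m(freq_hz, sigma=SIGMA_COPPER, mu_r=1.0):
    """Skin depth of a good conductor in metres."""
    return 1.0 / math.sqrt(math.pi * freq_hz * MU0 * mu_r * sigma)


def thickness_over_skin_depth(thickness_m, freq_hz, sigma=SIGMA_COPPER):
    """How many skin depths fit into the conductor thickness."""
    return thickness_m / skin_depth_m(freq_hz, sigma)


if __name__ == "__main__":
    thickness = 35e-6  # hypothetical 35 um copper trace
    for f in (1e6, 100e6, 1e9, 10e9):
        ratio = thickness_over_skin_depth(thickness, f)
        print(f"{f / 1e9:8.3f} GHz: delta = {skin_depth_m(f) * 1e6:7.2f} um, "
              f"thickness/delta = {ratio:6.1f}")
```

When the ratio is far above 1, the current crowds into a thin surface layer, which is the situation where meshing the conductor interior becomes difficult.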

Regardless of whether an S-parameter or a SPICE PCB model is used, why does a uA-level leakage current appear between power and ground when a DC/AC power source is connected across them?

Strictly speaking, this is caused by dielectric loss rather than by "leakage current" in the PCB. That is, the energy is lost because the Df of FR4 is NOT 0 (generally 0.016~0.02).
-- Do not set the Df to 0 (lossless material) just to eliminate this current. It is wrong, and it may cause the broadband model to violate passivity or causality.
-- For the same voltage at a given frequency, the currents obtained with S-parameter, DC SPICE, 1M SPICE, and 1G SPICE models are different. That is why SI+PI simulation uses an S-parameter rather than an RLC SPICE model.
-- In an RLC SPICE model, you will see an RG term that spans power and ground. It is in fact the reciprocal of G, its unit is ohm, and it is usually on the megohm scale.
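
The link between Df and the RG term can be illustrated with the usual lumped approximation G = 2 * pi * f * C * Df, so RG = 1/G. This is a simplified sketch, not how any particular tool builds its SPICE model; the plane capacitance, extraction frequency, and voltage are assumed values.

```python
import math


def dielectric_conductance_s(freq_hz, capacitance_f, df):
    """Shunt conductance from dielectric loss: G = 2*pi*f*C*Df (siemens)."""
    return 2.0 * math.pi * freq_hz * capacitance_f * df


def rg_ohm(freq_hz, capacitance_f, df):
    """RG term of a lumped model: the reciprocal of G, in ohms."""
    return 1.0 / dielectric_conductance_s(freq_hz, capacitance_f, df)


def loss_current_a(voltage_v, freq_hz, capacitance_f, df):
    """Current drawn through RG for a given voltage across power and ground."""
    return voltage_v * dielectric_conductance_s(freq_hz, capacitance_f, df)


if __name__ == "__main__":
    c_plane = 1e-9   # assumed 1 nF power/ground plane capacitance
    df = 0.02        # FR4 loss tangent from the text
    for f in (1e3, 1e6, 1e9):  # hypothetical extraction frequencies
        print(f"{f:10.0e} Hz: RG = {rg_ohm(f, c_plane, df):12.3e} ohm, "
              f"I(1 V) = {loss_current_a(1.0, f, c_plane, df):10.3e} A")
```

Because G scales with frequency, a model extracted at one frequency carries a fixed RG that does not match the loss at other frequencies, which is consistent with the different currents seen across the SPICE variants.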

If you would like to evaluate the effect of the return path, or of different reference points at the two ends of a transmission line, how do you add ports for the GND net in HFSS? (Where is the reference for GND ports?)

The ground (return path) effect cannot be separated out of an S-parameter, but it is completely contained in, and accounted for by, the S-parameter. Take a two-port S-parameter (.s2p): even if the IO models connected to both ends of Tx/Rx refer to net 0, it does not mean that the reference grounds at both ends of this S-parameter are directly short-circuited and that the parasitic effect of the return path is neglected. This concept is very important, but it is difficult to explain the operating mechanism of the circuit model behind it. If you want to know, please contact ANSYS.
Because they do not understand the facts above, some people like to extract the HFSS model with "GND ports". To do this, an additional ideal reference plane (perfect-E) has to be added below the entire model. It introduces extra parasitic capacitance and extra loop inductance, and it increases the length of the lumped ports. In addition, even with a normal .s2p (two ports, no additional ports for the ground net), when the S-parameters are imported into a Designer circuit you can choose to display the reference point of each port individually. That is, a two-port S-parameter model can be represented by a 4-pin circuit symbol, which allows the user to connect the corresponding reference grounds at the circuit level.
Therefore, assigning ports to the GND/reference net is not recommended, and it is unnecessary as well. But many people like to do it without truly understanding it. Facepalm...

For power integrity simulation with 0~1GHz accuracy requirements (such as SI+PI co-simulation sign-off), using S-parameters is best: it ensures not only DC accuracy but also that the model behavior at 10M, 100M, and 1GHz is accurate. The only limitation is that the noise contributed by the power path and by the ground path cannot be separated; only the overall loop effect of the power + ground path can be seen. What should those do who want to know whether the bottleneck of poor PI in a certain case comes from a poor power layout or a poor ground layout? You can use the 10M SPICE model for this purpose, but because the SPICE model is a single-frequency lumped model, it is not good enough for SI+PI co-simulation sign-off.