The latest GTF is available as the current release on the Ensembl FTP site. Many older versions are also listed there, and you can download any of them. The instructions for a tool may mention a specific version number that is not the latest. Which version should you choose?
I think you can choose any of them. The choice itself is not the point, because nobody can tell which result reflects the true expression levels better.
Take a look at the scatter plots below, which compare expression levels in TPM and counts generated with GTF versions 84 and 99. The expression levels are almost the same for most genes, but some differ substantially solely because of the GTF version.
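
If you want to list the genes that move between two GTF versions rather than only eyeball them on a plot, a minimal sketch like the following may help. It assumes you have already loaded each result as a dictionary mapping gene ID to TPM (or counts). The function name, the pseudocount, the threshold of a 2-fold change, and the toy values are all my assumptions for illustration, not part of any tool.

```python
import math
from typing import Dict, List, Tuple


def discordant_genes(
    expr_a: Dict[str, float],
    expr_b: Dict[str, float],
    min_abs_log2: float = 1.0,   # assumed threshold: 2-fold difference
    pseudocount: float = 1.0,    # assumed pseudocount to avoid log(0)
) -> List[Tuple[str, float]]:
    """Return (gene, log2 ratio b/a) for genes whose levels differ strongly.

    Only genes present in both tables are compared.
    """
    flagged = []
    for gene in sorted(expr_a.keys() & expr_b.keys()):
        ratio = (expr_b[gene] + pseudocount) / (expr_a[gene] + pseudocount)
        log2_ratio = math.log2(ratio)
        if abs(log2_ratio) >= min_abs_log2:
            flagged.append((gene, log2_ratio))
    return flagged


# Toy values, made up for illustration only.
tpm_gtf84 = {"geneA": 10.0, "geneB": 0.0, "geneC": 100.0}
tpm_gtf99 = {"geneA": 10.5, "geneB": 7.0, "geneC": 98.0, "geneD": 3.0}

flagged = discordant_genes(tpm_gtf84, tpm_gtf99)
print(flagged)
```

In this toy input only geneB is flagged. geneD is ignored because it exists in one table only, which is itself a typical consequence of changing the annotation.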

The estimated expression levels are highly sensitive to any difference in the algorithm (including its version and option settings), the reference genome used for mapping, and the GTF. Imagine mixing gene expression data sets produced by different researchers. You cannot expect them to have used precisely the same pipeline and reference data.
So, if you are merging data sets, you have to collect all FASTQ files and process them together. Don't merge tables of counts, TPM, FPKM, and so on.
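
As a small safeguard, you could record how each table was produced and check that everything matches before combining anything. The sketch below is only an illustration: the provenance keys, the data set names, and the genome label are hypothetical, and you would adapt them to whatever your own pipeline records.

```python
from typing import Dict, List

# Hypothetical provenance fields; adapt to what your pipeline records.
PROVENANCE_KEYS = ("aligner", "aligner_version", "genome", "gtf_version")


def provenance_mismatches(datasets: Dict[str, Dict[str, str]]) -> List[str]:
    """Return the provenance keys whose values differ between data sets."""
    mismatched = []
    for key in PROVENANCE_KEYS:
        # A missing key becomes None, so it also counts as a mismatch.
        values = {meta.get(key) for meta in datasets.values()}
        if len(values) > 1:
            mismatched.append(key)
    return mismatched


# Made-up example: two tables produced with different settings.
datasets = {
    "lab_A": {"aligner": "HISAT2", "aligner_version": "2.1.0",
              "genome": "genome_X", "gtf_version": "84"},
    "lab_B": {"aligner": "HISAT2", "aligner_version": "2.2.0",
              "genome": "genome_X", "gtf_version": "99"},
}

mismatches = provenance_mismatches(datasets)
if mismatches:
    print("Do not merge these tables; reprocess the FASTQ files. Differs in:",
          mismatches)
```

If the list is not empty, go back to the FASTQ files and process them together.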
The genes extracted as apparently up- or down-regulated are profoundly affected by the pipeline, so you cannot trust the results at the individual gene level.

If you compare the lists of up-regulated genes obtained with the 84 and 99 GTFs, some genes are in the intersection, but probably not nearly all of them.
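
To put a number on the overlap, you can compare the two lists as sets. This is a minimal sketch. The function name and the gene IDs are placeholders I made up, and the Jaccard index is just one reasonable way to summarize the overlap.

```python
from typing import Dict, Iterable, Union


def compare_gene_lists(
    up_a: Iterable[str], up_b: Iterable[str]
) -> Dict[str, Union[set, float]]:
    """Summarize the overlap between two lists of gene IDs."""
    set_a, set_b = set(up_a), set(up_b)
    shared = set_a & set_b
    union = set_a | set_b
    return {
        "shared": shared,
        "only_a": set_a - set_b,
        "only_b": set_b - set_a,
        # Jaccard index: shared genes divided by all genes in either list.
        "jaccard": len(shared) / len(union) if union else 0.0,
    }


# Placeholder gene IDs, for illustration only.
up_gtf84 = ["G1", "G2", "G3", "G4"]
up_gtf99 = ["G3", "G4", "G5"]

overlap = compare_gene_lists(up_gtf84, up_gtf99)
print(sorted(overlap["shared"]), overlap["jaccard"])
```

Here two genes are shared out of five in total, so the Jaccard index is 0.4.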

Although the lists of up-regulated genes differ, you can expect the biological conclusion drawn from the enrichment analysis to be similar, due to the nature of omics.

Please examine the differences between GTF versions yourself by importing the SSA file of this data set.

Additionally, I compared data generated with HISAT2 2.1.0 and 2.2.0. The difference is smaller, but it still shows that you should not mix data from different versions. The SSA file is available.