Feature selection (FS) can be considered a preprocessing step whose aim is to identify a subset of features that allows a model to achieve low bias and low variance [1, 2].
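As a minimal sketch of what FS looks like in practice, the snippet below uses scikit-learn's SelectKBest as one example of a filter-based selector; the synthetic dataset and the choice of k=10 are illustrative assumptions, not from the text.

```python
# Minimal filter-based feature selection sketch with scikit-learn.
# The synthetic dataset and k=10 are illustrative assumptions.
from sklearn.datasets import make_classification
from sklearn.feature_selection import SelectKBest, f_classif

X, y = make_classification(n_samples=500, n_features=50,
                           n_informative=8, random_state=0)

# Score each feature against the target and keep the 10 best.
X_selected = SelectKBest(score_func=f_classif, k=10).fit_transform(X, y)
print(X_selected.shape)  # (500, 10)
```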
Meanwhile, the primary aim of hyperparameter optimization (HPO) is to automate the hyper-parameter tuning process and make it possible for users to apply Machine Learning (ML) models to practical problems effectively [3]. Some important reasons for applying HPO techniques to ML models are as follows [3]:
It reduces the human effort required, since many ML developers spend considerable time tuning hyper-parameters, especially for large datasets or complex ML algorithms with a large number of hyper-parameters.
It improves the performance of ML models. Many ML hyper-parameters have different optimal values for different datasets or problems.
It makes models and research more reproducible. Different ML algorithms can be compared fairly only when the same level of hyper-parameter tuning is applied to each; hence, using the same HPO method on different ML algorithms also helps to determine the most suitable ML model for a specific problem (sketched below).
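To illustrate that last point, here is a hedged sketch, assuming scikit-learn, that applies an identical grid-search protocol to two different algorithms so their tuned scores are comparable; the estimators, grids, and synthetic data are all placeholders.

```python
# Sketch: the same HPO procedure (5-fold grid search) applied to two
# algorithms so their tuned performance can be compared fairly.
# Estimators, grids, and the synthetic data are illustrative assumptions.
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import GridSearchCV
from sklearn.svm import SVC

X, y = make_classification(n_samples=500, n_features=20, random_state=0)

candidates = {
    "svm": (SVC(), {"C": [0.1, 1, 10], "gamma": ["scale", 0.01, 0.001]}),
    "rf": (RandomForestClassifier(random_state=0),
           {"n_estimators": [100, 300], "max_depth": [None, 5, 10]}),
}

# Identical tuning protocol for every candidate model.
for name, (model, grid) in candidates.items():
    search = GridSearchCV(model, grid, cv=5).fit(X, y)
    print(name, search.best_score_, search.best_params_)
```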
Given the above differences between the two, I think FS should be applied first, followed by HPO, for a given algorithm.
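A minimal sketch of that ordering, assuming scikit-learn: FS runs as the first pipeline step, so the grid search tunes the classifier only on the selected features. The choices of SelectKBest, RandomForestClassifier, k=10, and the grid are all illustrative assumptions.

```python
# Sketch: FS applied first, followed by HPO of the downstream model.
# SelectKBest, the classifier, k=10, and the grid are assumptions.
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.feature_selection import SelectKBest, f_classif
from sklearn.model_selection import GridSearchCV
from sklearn.pipeline import Pipeline

X, y = make_classification(n_samples=500, n_features=50,
                           n_informative=8, random_state=0)

pipe = Pipeline([
    ("fs", SelectKBest(score_func=f_classif, k=10)),  # step 1: FS
    ("clf", RandomForestClassifier(random_state=0)),  # step 2: model
])

# HPO over the classifier's hyper-parameters, on the reduced feature set.
param_grid = {"clf__n_estimators": [100, 300],
              "clf__max_depth": [None, 5, 10]}
search = GridSearchCV(pipe, param_grid, cv=5).fit(X, y)
print(search.best_params_, search.best_score_)
```

Wrapping both steps in one Pipeline has the side benefit that the feature selection is re-fit inside each cross-validation fold, so the selected features never leak information from the held-out data into the tuning process.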
[1] Tsai, C.F., Eberle, W. and Chu, C.Y., 2013. Genetic algorithms in feature and instance selection. Knowledge-Based Systems, 39, pp.240-247.
[2] Kuhn, M. and Johnson, K., 2013. Applied Predictive Modeling. Springer. ISBN: 9781461468493.
[3] Hutter, F., Kotthoff, L. and Vanschoren, J. (Eds.), 2019. Automated Machine Learning: Methods, Systems, Challenges. Springer. ISBN: 9783030053185.