I heard that dynamic_cast is not really elegant and that it is also slow.
"Elegant" is the most worthless word in a programmer's vocabulary. Strike it from your lexicon.
Every programmer has an utterly different idea of what "elegant" means. If you like the code and it works, why worry about whether some armchair engineer on the Internet thinks it's "elegant" or not? It's not their code, it's not their project, it's not their concern.
Regarding whether dynamic_cast is slow: it's not the fastest option but it's a perfectly good one. Much bigger and more complex software than your game manages to use dynamic_cast frequently and still work just fine.
Is there a way to optimize this code and if so how?
There's always a way to optimize code. Do you have a measured and verified performance problem?
Your GetComponent is not optimal, no. There are a few things you can do to make it faster if you want.
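You can cache each component's type as a std::type_index, which can be constructed from the result of the typeid operator, and compare those instead of calling dynamic_cast. This won't work as intended if you're querying for base classes/interfaces, though, because the comparison only matches the exact type.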
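To make the type_index idea concrete, here is a minimal sketch. The names Component, Sprite, Physics, GameObject, and AddComponent are placeholders of mine, not your actual classes, and I'm assuming one component per type to keep it short.

```cpp
#include <cassert>
#include <memory>
#include <typeindex>
#include <typeinfo>
#include <utility>
#include <vector>

// Hypothetical component hierarchy, for illustration only.
struct Component {
    virtual ~Component() = default;
};
struct Sprite : Component { int frame = 0; };
struct Physics : Component { float mass = 1.0f; };

class GameObject {
    // Each entry remembers the exact type it was added as.
    struct Entry {
        std::type_index type;
        std::unique_ptr<Component> component;
    };

public:
    template <typename T, typename... Args>
    T& AddComponent(Args&&... args) {
        // Assumption of this sketch: at most one component per type.
        assert(GetComponent<T>() == nullptr && "component type already present");
        auto owned = std::make_unique<T>(std::forward<Args>(args)...);
        T& ref = *owned;
        entries_.push_back(Entry{std::type_index(typeid(T)), std::move(owned)});
        return ref;
    }

    template <typename T>
    T* GetComponent() const {
        const std::type_index wanted(typeid(T));
        for (const Entry& e : entries_) {
            // Exact-type comparison; no dynamic_cast involved.
            if (e.type == wanted) {
                return static_cast<T*>(e.component.get());
            }
        }
        return nullptr;
    }

private:
    std::vector<Entry> entries_;
};
```

Note that GetComponent<Component>() returns nullptr here even when a Sprite is attached. That is the base class limitation mentioned above.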
If using the type_index, then sort the list and do a binary search instead of a linear search. If you stick with dynamic_cast then sorting wouldn't quite work, though you could cache the results and do binary searches on those. boost::container::flat_map provides a premade data structure for this, or you can roll your own with a vector and std::lower_bound. You could also use a hash table, but C++'s unordered_map is not likely to be faster unless your objects have a very large number of components (hundreds or thousands) for some crazy reason.
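A roll-your-own version with a sorted vector and std::lower_bound might look roughly like the following. SortedComponents, Add, and Get are names I made up for the sketch, and it again assumes one component per exact type. The order that std::type_index sorts in is unspecified, but it is consistent within a run, which is all a binary search needs.

```cpp
#include <algorithm>
#include <cassert>
#include <memory>
#include <typeindex>
#include <typeinfo>
#include <utility>
#include <vector>

// Hypothetical component hierarchy, for illustration only.
struct Component { virtual ~Component() = default; };
struct Sprite : Component { int frame = 0; };
struct Physics : Component { float mass = 1.0f; };
struct Health : Component { int points = 100; };

class SortedComponents {
    struct Entry {
        std::type_index type;
        std::unique_ptr<Component> component;
    };

    // First entry whose key is not less than `key` (binary search).
    std::vector<Entry>::const_iterator LowerBound(std::type_index key) const {
        return std::lower_bound(
            entries_.begin(), entries_.end(), key,
            [](const Entry& e, const std::type_index& k) { return e.type < k; });
    }

public:
    template <typename T>
    T& Add() {
        const std::type_index key(typeid(T));
        auto pos = LowerBound(key);
        // Assumption of this sketch: at most one component per type.
        assert((pos == entries_.end() || pos->type != key) && "duplicate type");
        auto owned = std::make_unique<T>();
        T& ref = *owned;
        // Inserting at the lower bound keeps the vector sorted by type.
        entries_.insert(pos, Entry{key, std::move(owned)});
        return ref;
    }

    template <typename T>
    T* Get() const {
        const std::type_index key(typeid(T));
        auto pos = LowerBound(key);
        if (pos == entries_.end() || pos->type != key) {
            return nullptr;
        }
        return static_cast<T*>(pos->component.get());
    }

private:
    std::vector<Entry> entries_;
};
```

With only a handful of components per object, a linear scan over a small vector may well be just as fast as this, so treat it as something to measure rather than a guaranteed win.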
Querying a "system" in ECS parlance using an O(1) hash table of game object IDs to component indices would maybe be faster than the binary search... maybe. As with all performance work, measure and find out for sure. It'll depend on a lot of factors.
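For comparison, a "system" that owns all components of one kind and maps object IDs to indices could be sketched like this. ObjectId, SpriteSystem, Add, and Find are hypothetical names, and a real ECS would also handle removal and probably avoid handing out raw pointers.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <unordered_map>
#include <vector>

using ObjectId = std::uint32_t;  // hypothetical ID type for game objects

struct Sprite { int frame = 0; };

class SpriteSystem {
public:
    Sprite& Add(ObjectId id) {
        // Assumption of this sketch: one Sprite per game object.
        assert(index_.find(id) == index_.end() && "object already has a Sprite");
        index_.emplace(id, sprites_.size());
        sprites_.emplace_back();
        return sprites_.back();
    }

    // Average O(1) lookup from object ID to its component, or nullptr.
    Sprite* Find(ObjectId id) {
        auto it = index_.find(id);
        return it == index_.end() ? nullptr : &sprites_[it->second];
    }

private:
    std::vector<Sprite> sprites_;                       // contiguous storage
    std::unordered_map<ObjectId, std::size_t> index_;   // ID -> index into sprites_
};
```

Pointers and references returned here can be invalidated when the vector grows, so look the component up again rather than holding onto the pointer across Add calls.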
Your real perf gain will just come from not calling GetComponent more than you have to. Doing work faster will never be as fast as not doing it at all.
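As a rough illustration of "not calling it more than you have to", the sketch below looks a component up once and keeps the pointer instead of querying every frame. GameObject, Transform, and Mover are stand-ins I invented, and the lookups counter exists only to show how many queries happen.

```cpp
#include <cassert>
#include <memory>
#include <vector>

// Hypothetical types, for illustration only.
struct Component { virtual ~Component() = default; };
struct Transform : Component { float x = 0.0f; };

struct GameObject {
    std::vector<std::unique_ptr<Component>> components;
    mutable int lookups = 0;  // instrumentation for this sketch only

    // Plain linear search with dynamic_cast.
    template <typename T>
    T* GetComponent() const {
        ++lookups;
        for (const auto& c : components) {
            if (T* hit = dynamic_cast<T*>(c.get())) {
                return hit;
            }
        }
        return nullptr;
    }
};

class Mover {
public:
    // Query once, up front, and cache the result.
    explicit Mover(const GameObject& owner)
        : transform_(owner.GetComponent<Transform>()) {
        assert(transform_ != nullptr && "Mover requires a Transform");
    }

    // Called every frame; no lookup happens here.
    void Update(float dt) { transform_->x += 10.0f * dt; }

private:
    Transform* transform_;  // valid only while the owner keeps the component
};
```

The trade-off is that the cached pointer dangles if the component is ever removed or the object destroyed, so this only fits when you control those lifetimes.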
You can replace the standard RTTI with a custom system (difficult and only slightly faster) or a simple enum (inflexible) or a string name (slower) or several other options. I've been checking lately whether the default RTTI is fast enough for big AAA games and perf-sensitive apps (at least on PC and the latest consoles) and so far the data I have indicates a strong "Yes."
Is it really a big deal or does it not matter in a small Super Mario sized 2D platformer?
Almost certainly not a big deal. If you want to know for sure, measure your performance and find out. The engines I work with will have entirely different bottlenecks than your code, which will have different bottlenecks than any other engine or game.
Measure.
Then make a hypothesis about anything you can improve, update your code, and measure again and make sure you see the changes you expected. Often you'll get a surprising regression, even if you're doing what some game industry veteran on the Internet told you is the obvious fix.
Measure again.
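If you don't have a profiler handy, even a crude timer is better than guessing. Here is a minimal sketch using std::chrono::steady_clock. AverageNanoseconds is a helper name I made up, and the numbers it gives are only meaningful in an optimized build when the timed work has a side effect the compiler can't throw away.

```cpp
#include <cassert>
#include <chrono>

// Runs fn `iterations` times and returns the average wall-clock time
// per call in nanoseconds. A crude tool, not a replacement for a profiler.
template <typename Fn>
double AverageNanoseconds(Fn&& fn, int iterations) {
    assert(iterations > 0);
    using Clock = std::chrono::steady_clock;  // monotonic, suited to intervals
    const auto start = Clock::now();
    for (int i = 0; i < iterations; ++i) {
        fn();
    }
    const auto elapsed = Clock::now() - start;
    return std::chrono::duration<double, std::nano>(elapsed).count() / iterations;
}
```

Time the code before and after a change with the same inputs, for example by wrapping a lambda that calls your GetComponent, and compare the two numbers rather than trusting either one in isolation.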