I assume there are really two questions here:
Why does the compiler not pack four bytes into a single Int32?
Local variables are typically optimized for speed of access, not for storage. On some architectures, reading a value that is not aligned to the machine's natural word size cannot be done in a single instruction, and until recently (around 2009) an unaligned access was an order of magnitude slower than an aligned one. Compiler authors therefore typically use aligned widths as a reasonable tradeoff.
Beyond that, the .NET Framework is specified not for x86 but for the Common Language Infrastructure (CLI) virtual machine. The CLI spec must support the lowest common denominator, and architectures such as IA64 and ARM have not supported unaligned memory access. To that end, the CLI stack "can only store values that are a minimum of 4 bytes wide" (P.330).
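The effect of that minimum stack width is visible from C# itself: arithmetic on two byte operands produces an int. The snippet below is a minimal sketch of that behavior; the class and method names (StackWidthDemo, TypeOfByteSum, TruncatedSum) are made up for illustration.

```csharp
using System;

public static class StackWidthDemo
{
    // Returns the runtime type of the result of adding two bytes.
    public static Type TypeOfByteSum()
    {
        byte a = 200;
        byte b = 100;
        var sum = a + b;       // both operands are widened to int before the add
        return sum.GetType();  // System.Int32, not System.Byte
    }

    // Getting a byte back requires an explicit narrowing cast.
    public static byte TruncatedSum(byte a, byte b)
    {
        return unchecked((byte)(a + b)); // keeps only the low 8 bits
    }
}
```

Calling StackWidthDemo.TruncatedSum(200, 100) returns 44, because the intermediate int value 300 is truncated to its low byte.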
Why did they do this? I would imagine the potential or real performance gains outweigh the increase in memory usage. Given the additional limitation of 64 functional locals in any given scope, there should be a strong desire (beyond good design) to keep the number of variables in a given scope small. In the worst case, where every local is a byte stored in 4 bytes, the net overhead is therefore limited to 64 × 3 = 192 bytes, which equates to an additional 0.0000002% of memory used in my system.
Keep in mind that if you are accessing an array of bytes, the local variable really stores a single pointer, which is the width of a memory address (4 or 8 bytes), and the bytes themselves are accessed directly in memory. You are then managing the semantics of which byte is which, and you take on that complexity.
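To show what "managing the semantics of which byte is which" can look like, here is a minimal sketch that stores fixed-size records in one byte array. The layout (four one-byte fields per record) and the names PackedRecords, GetField and SetField are assumptions made up for this example.

```csharp
using System;

public static class PackedRecords
{
    // Hypothetical layout: every record occupies exactly 4 consecutive bytes.
    public const int RecordSize = 4;

    // One allocation holds all records with no per-field padding.
    public static byte[] Allocate(int recordCount)
    {
        return new byte[recordCount * RecordSize];
    }

    // The caller-side index arithmetic is the "complexity" being taken on.
    public static byte GetField(byte[] buffer, int record, int field)
    {
        if (field < 0 || field >= RecordSize) throw new ArgumentOutOfRangeException(nameof(field));
        return buffer[record * RecordSize + field];
    }

    public static void SetField(byte[] buffer, int record, int field, byte value)
    {
        if (field < 0 || field >= RecordSize) throw new ArgumentOutOfRangeException(nameof(field));
        buffer[record * RecordSize + field] = value;
    }
}
```

Three records cost 12 bytes of payload plus the array's own header, and field 1 of record 2 lives at index 9.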
How can I store things in a compact form to minimize memory usage?
As you point out, if your data is a large number of bytes, use a byte array to avoid the overhead. If your data is of varying types, use one of the many facilities that allow access to packed data: BinaryReader, BinaryWriter, BitConverter, unsafe code, and structs with the StructLayoutAttribute.Pack field set all come to mind.
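As a rough illustration of the Pack option, the sketch below declares the same three fields twice, once with default sequential layout and once with Pack = 1, and compares their marshaled sizes. The struct and class names (PaddedSample, PackedSample, LayoutDemo) are invented for the example.

```csharp
using System;
using System.Runtime.InteropServices;

// Default packing: the int is aligned to 4 bytes, so padding follows each byte.
[StructLayout(LayoutKind.Sequential)]
public struct PaddedSample
{
    public byte Flag;
    public int Value;
    public byte Tag;
}

// Pack = 1: fields are laid out back to back with no padding.
[StructLayout(LayoutKind.Sequential, Pack = 1)]
public struct PackedSample
{
    public byte Flag;
    public int Value;
    public byte Tag;
}

public static class LayoutDemo
{
    public static int PaddedSize() { return Marshal.SizeOf<PaddedSample>(); }
    public static int PackedSize() { return Marshal.SizeOf<PackedSample>(); }
}
```

The packed struct is 6 bytes (1 + 4 + 1), while the default layout is typically 12 on common platforms. The price of the packed form is that Value is no longer aligned, which is the speed-for-space trade discussed above.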
If you have a ridiculous amount of data, use memory-mapped files with fixed-layout structs to minimize memory use while still allowing for datasets larger than the amount of memory in the machine. Is it harder than normal memory access? Yes, yes it is - but optimization is a balancing act of managing memory usage, speed, and programmer labor. The cheapest one is typically memory.
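A minimal sketch of that approach, assuming a made-up Reading record and a hypothetical helper named MappedStore.WriteThenRead, might look like this. It uses MemoryMappedFile.CreateFromFile and the generic Read/Write methods on the view accessor. A real store would keep the mapping open and map views over parts of the file rather than reopening it per call.

```csharp
using System;
using System.IO;
using System.IO.MemoryMappedFiles;
using System.Runtime.InteropServices;

// Hypothetical fixed-layout record: 5 bytes on disk, no padding.
[StructLayout(LayoutKind.Sequential, Pack = 1)]
public struct Reading
{
    public byte Sensor;
    public int Value;
}

public static class MappedStore
{
    // Creates a file sized for capacityInRecords records, writes one record
    // at the given index, then reads it back through the same mapping.
    public static Reading WriteThenRead(string path, long index, Reading value, long capacityInRecords)
    {
        int size = Marshal.SizeOf<Reading>();
        using (var file = MemoryMappedFile.CreateFromFile(path, FileMode.Create, null, capacityInRecords * size))
        using (var view = file.CreateViewAccessor())
        {
            long offset = index * size;   // we decide where each record lives
            view.Write(offset, ref value);
            Reading result;
            view.Read(offset, out result);
            return result;
        }
    }
}
```

The operating system pages the file in and out on demand, so only the records actually touched need to be resident in memory.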
Or spend a couple hundred bucks and get enough RAM that it doesn't matter. 32 GB ($240 USD on Newegg) allows for quite a bit of not-caring in most situations.