It's best not to think of
reg [1022:0] my_array[1022:0];
as a 1023 by 1023 array; it's best to think of it as an array of 1023 1023-bit numbers. This is a true 1023 by 1023 array:
reg my_array[1022:0][1022:0];
SystemVerilog generalises arrays and introduces some new terminology that is useful when talking about Verilog. (Verilog is now officially a subset of SystemVerilog, anyway: IEEE 1364 was merged into IEEE 1800 in 2009.) In SystemVerilog, you declare arrays like this:
<type> <packed dimensions> <name> <unpacked dimensions>
So, the equivalent of your original declaration would be
logic [1022:0] my_array [1022:0];
Here, the first [1022:0] is a packed dimension and the second [1022:0] is an unpacked dimension. In SystemVerilog, you can have as many packed dimensions as you like and as many unpacked dimensions as you like. In Verilog, you can have as many unpacked dimensions as you like, but you can only have one packed dimension.
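To make the packed/unpacked distinction concrete, here is a minimal sketch of the three shapes side by side. The module and signal names (dims_sketch, words, grid, quad) are made up for illustration and are not from the question; the $bits calls are only there to show how wide a single element is in each case.

```systemverilog
// Hypothetical module and signal names, for illustration only.
module dims_sketch;
  // One packed and one unpacked dimension: 1023 words, each 1023 bits wide.
  logic [1022:0] words [1022:0];

  // Two unpacked dimensions: a 1023 by 1023 grid of single bits.
  logic grid [1022:0][1022:0];

  // Two packed dimensions (SystemVerilog only): 4 bytes packed into one vector.
  logic [3:0][7:0] quad;

  initial begin
    $display("bits in one element of words: %0d", $bits(words[0]));
    $display("bits in one element of grid:  %0d", $bits(grid[0][0]));
    $display("bits in quad as a whole:      %0d", $bits(quad));
  end
endmodule
```

The packed dimensions sit to the left of the name and describe the width of one element; the unpacked dimensions sit to the right and describe how many elements there are.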
The rules for accessing packed dimensions are different from the rules for accessing unpacked dimensions: the rules for unpacked dimensions are much stricter. (This is why you see a difference in behaviour between rows and columns.) For unpacked dimensions, you can either
- access a single element or
- access the whole array at once.
That's it. So, when you're dealing with the unpacked dimension, you're going to have to use some kind of loop to do what you want to do. And, if you are really working with an array of 1023 by 1023 single-bit numbers, then, given it's always going to be more work to access the unpacked dimensions, you might want to consider making both dimensions unpacked and using loops for both dimensions.
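As a rough sketch of what that loop might look like, the example below pulls one "row" and one "column" out of an array with one packed and one unpacked dimension. It assumes SystemVerilog syntax and uses a small size N = 8 as a stand-in for 1023; the names column_sketch, row, column and copy are hypothetical.

```systemverilog
// Hypothetical names; N = 8 is a small stand-in for 1023.
module column_sketch;
  localparam int N = 8;

  logic [N-1:0] my_array [N-1:0];  // N words of N bits each
  logic [N-1:0] copy     [N-1:0];  // same shape, used for a whole-array copy
  logic [N-1:0] row;
  logic [N-1:0] column;

  initial begin
    // Fill each word with its own index so there is something to read back.
    for (int i = 0; i < N; i++)
      my_array[i] = i[N-1:0];

    // Unpacked dimension: a single element is fine...
    row = my_array[3];

    // ...and so is the whole array at once (SystemVerilog assignment).
    copy = my_array;

    // Packed dimension: bit- and part-selects are allowed inside one element.
    $display("low nibble of word 3: %b", my_array[3][3:0]);

    // There is no single select that gathers bit 2 of every word,
    // so the unpacked dimension is walked with a loop instead.
    for (int i = 0; i < N; i++)
      column[i] = my_array[i][2];

    $display("row 3    = %b", row);
    $display("column 2 = %b", column);
  end
endmodule
```

If both dimensions were unpacked instead, the row would need the same kind of loop as the column, which at least makes the two directions symmetrical.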