I'm trying to write a purely memory-intensive script in Python for testing purposes, but every script I try also increases my CPU usage. I've read this post and also tried, among others:

#!/usr/bin/python
from datetime import datetime
startTime = datetime.now()

l1 = [17]*900
l2=[]

j=0
while j<9000:
    l2=l1
    j=j+1
print "Finished in ", datetime.now() - startTime

in order to copy one array into another, but once again the CPU usage varied as well.

UPDATE: So, how can I produce steady CPU utilization (100% usage of one core) with 45% memory utilization, followed after a couple of minutes by an increase in memory utilization to 90%?

  • If the memory is not used (i.e. CPU), it will likely not be in RAM... Commented Jun 17, 2016 at 18:42
  • You can certainly allocate memory and not use it, but that's not very helpful for performance testing, because unused memory gets swapped out. If it's not even written to once, then it's purely virtual memory being allocated, not even physical. Commented Jun 17, 2016 at 18:42
  • ...anyhow, if your goal is to minimize CPU resource utilization, Python isn't the right tool for this particular task -- however efficient you make your Python script for the purpose, one will be able to do significantly better in C. Commented Jun 17, 2016 at 18:43
  • Just do x = range(10**9)? Short CPU-spike, lasting high memory-usage. Commented Jun 17, 2016 at 19:04
  • If your python script is single threaded, it's not going to incur more than 100% cpu. Commented Jun 17, 2016 at 19:32

1 Answer

You have a couple of misconceptions that I'll try to address.

  • You have to use CPU to use memory. There's no other way.
  • Your copy of a list only assigns a reference: l2 = l1 makes both names point at the same list object. You're not moving memory.
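To illustrate the second point, here is a minimal sketch that contrasts plain assignment with an actual copy. The names alias and clone are hypothetical and chosen only for this example; l1 mirrors the list from the question.

```python
l1 = [17] * 900

alias = l1        # binds a second name to the same list object; nothing is copied
clone = list(l1)  # allocates a new list and copies the 900 element references

alias_is_same = alias is l1
clone_is_same = clone is l1

# Mutating l1 is visible through the alias but not through the copy.
l1[0] = 99
alias_sees_change = alias[0] == 99
clone_sees_change = clone[0] == 99

print(alias_is_same, clone_is_same, alias_sees_change, clone_sees_change)
```

Even list(l1) copies only references to the same integer objects, so it is still a small allocation compared with the sizes discussed in this answer.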

If you want to increase memory utilization, you need to keep adding data to your list:

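l = []
for i in range(0, 1024*1024):
    # bytearray(1024) allocates a fresh 1 KiB buffer on every iteration
    # (roughly 1 GiB in total). A constant expression such as "*" * 1024
    # may be folded into one shared object by newer interpreters, in
    # which case the list would only hold repeated references to it.
    l.append(bytearray(1024))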

Or using something similar to your method,

l = [17] * 1024

for i in range(0, 16):
    l = l + l  # doubles the list each time; ends with 1024 * 2**16 elements

That will allocate the memory. If you want to measure access to it in isolation, you'll want to loop over l modifying the values or summing them.

sum(l)

or

for i in range(0, len(l)):
    l[i] += 1
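
As a rough sketch of measuring the two phases separately, the allocation and the access can each be timed on their own. This assumes timeit.default_timer as the clock and a hypothetical list size of 1024*1024 elements; the variable names are illustrative only, and the numbers will vary from run to run.

```python
from timeit import default_timer as timer

t0 = timer()
data = [17] * (1024 * 1024)  # allocation phase: builds the list
t1 = timer()
total = sum(data)            # access phase: reads every element once
t2 = timer()

alloc_seconds = t1 - t0
access_seconds = t2 - t1

print("allocation: %.6f s, access: %.6f s" % (alloc_seconds, access_seconds))
```

Both phases still consume CPU time; the split only shows how much of it goes to each.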

In the end, your benchmark is going to be very simplistic: it doesn't address multiple cores accessing memory simultaneously, and it doesn't take into account processor caches, lookahead, random versus serial access, and so on. Python is also not ideal here, because you are not in full control of memory allocation and garbage collection.

Proper memory benchmarking is a deep subject...

Edit:

This is what you are asking for, more or less:

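from datetime import datetime, timedelta

# First allocation: 1 GiB of '*' characters, filled in as it is created.
memory1 = "*" * 1024**3

start = datetime.now()

j = 0

# Busy-wait to keep one core at 100% for the first minute.
while (datetime.now() - start) < timedelta(minutes=1):
    j += 1

# Second allocation: another 1 GiB, doubling the memory held.
memory2 = "*" * 1024**3

# Keep spinning until two minutes after the start.
while (datetime.now() - start) < timedelta(minutes=2):
    j += 1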

You can adjust memory1 and memory2 to get your 45% and 90%, depending on your actual system size. The program has to use the CPU while it allocates each string: it first requests the memory from the kernel, and then has to fill it with '*', otherwise the memory would only be virtual. If you were writing this in C, you could just touch one byte in each 4k page.
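
The one-byte-per-page idea can also be approximated from Python. The following is only a sketch, assuming an anonymous mapping from the standard mmap module; the helper name allocate_and_touch and the 64-page size are hypothetical and chosen for illustration. Whether the touched pages stay resident is up to the operating system.

```python
import mmap

def allocate_and_touch(size_bytes, page_size=mmap.PAGESIZE):
    """Map anonymous memory and write one byte per page."""
    buf = mmap.mmap(-1, size_bytes)  # -1: anonymous mapping, not backed by a file
    for offset in range(0, size_bytes, page_size):
        # Writing a single byte forces the kernel to back this page
        # with real memory instead of leaving it purely virtual.
        buf[offset:offset + 1] = b"\x01"
    return buf

# Small size for demonstration; scale it up to reach the target utilization.
buf = allocate_and_touch(64 * mmap.PAGESIZE)
print(len(buf))
```

The loop itself still costs CPU time in the interpreter, but far less than filling every byte of the region.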


2 Comments

Thank you for your reply, I've updated my question. Your script increases both cpu and memory util.
Thank you @rrauenza You can change the second timedelta to 2 minutes in order to have the double memory utilization for the same time. Thank you again!!!
