Please correct here: Getting different ids for Array imported from multiprocessing module

Fenil · August 7, 2023, 10:29pm

This machine has 4 CPUs
Parent process puts item on queue with id 4480832448
Child process received var = 1 with id 4430553984 from queue
Parent process changed the enqueued item to 101
After changes by parent process, child process sees var as = 101

According to my research and a ChatGPT response:

To clarify, when using the Array from the multiprocessing module, the memory regions are indeed separate between processes. This is different from the behavior of using the Value object, which shares memory between processes.

Course: https://www.educative.io/courses/python-concurrency-for-senior-engineering-interviews
Lesson: https://www.educative.io/courses/python-concurrency-for-senior-engineering-interviews/xoKzopOmA4B

Javeria_Tariq · August 8, 2023, 4:09am

Hi @Fenil !!
In the example,

from multiprocessing import Process, Semaphore, Array
import multiprocessing

def child_process(sem1, sem2, arr):
    print("Child process received var = {0} with id {1} from queue".format(str(arr[0]), id(arr)), flush=True)
    sem1.release()
    sem2.acquire()

    print("After changes by parent process, child process sees var as = {0}".format(arr[0]), flush=True)


if __name__ == '__main__':
    sem1 = Semaphore(0)
    sem2 = Semaphore(0)
    print("This machine has {0} CPUs".format(str(multiprocessing.cpu_count())))

    arr = Array('i', range(5))
    print("Parent process puts item on queue with id " + str(id(arr)))

    process = Process(target=child_process, args=(sem1, sem2, arr))
    process.start()

    sem1.acquire()

    # change var and verify the change is reflected in the child process
    arr[0] += 100
    print("Parent process changed the enqueued item to " + str(arr[0]), flush=True)
    sem2.release()
    process.join()

When we create an Array using the multiprocessing module, we’re creating a shared memory region that can be accessed and modified by multiple processes. The different IDs you’re observing do not indicate whether the memory is shared or not.

When you create an Array and print its ID using id(arr), you’re printing the ID of the Array object itself, not the ID of the shared memory region it represents. Even though the printed IDs are different, this does not mean that the memory is not shared between processes.

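In fact, the memory underlying the Array is shared between processes, allowing them to see and modify the same data. Your own output confirms this: after the parent changes arr[0] to 101, the child prints 101. The IDs differ because id() identifies a Python object only within the process that calls it (in CPython it is the object’s memory address), and the parent and the child each hold their own Array wrapper object.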

Here’s the relevant part of the code:

arr = Array('i', range(5))
print("Parent process puts item on queue with id " + str(id(arr)))

process = Process(target=child_process, args=(sem1, sem2, arr))

In this code, the ID you’re printing is the ID of the Array object arr, not the ID of the shared memory region it represents. The Array object itself is distinct in each process, but the memory it refers to is shared.
The key takeaway: the memory underlying the Array is shared between processes, so they work with the same data even though the IDs of the Array objects themselves may differ in each process.
I hope it helps. Happy Learning!