
/bun/ - #14322

113790956_p0.jpg (234 KB, 768x1024)
VIPPER
>>20137
VIPPER
Oyasumibutts
Nameless
have you tried the new gefs
aaa i want a raspberry to run 9front
>>21728
VIPPER
>>21725
I have it on my THINKPAD wwww
seems ok, prefer cwfs for the big fucking server though
>>21726
whoa, GUMI
where's that fucking nerd GUMI cosplayer
Anonymous
gumi cutie
VIPPER
Who homu
VIPPER
program doesn't work at all but technically not a supported OS release
thank you theo
VIPPER
kill everyone who has ever written an ifdef
VIPPER
plop tuah
VIPPER
Kill every horse
VIPPER
horse genocide now
VIPPER
Oyasumi bunbune
Anonymous
noooo vip
VIPPER
Awaken burgfigerd
VIPPER
Borgfrogor
Anonymous
borgar
VIPPER
Trajodie of the plopping
VIPPER
#plop (42)
VIPPER
Nite busts
VIPPER
Ploppy morning
VIPPER
Synced
Not Cenk'd
VIPPER
What keeps plopkind alive
VIPPER
nitebuns
VIPPER
videogames
VIPPER
now ai
deogames
VIPPER
Destroy horse
Anonymous
hey vip I got a programming question for you
I'm struggling to make a parallel code base run well. I'm not sure how to debug things, so, here we go:
1. I have an algorithm that needs about 20 MB of RAM to run on ~2 MB of data.
2. I have 112 CPU cores.
The sets of 2 MB of data are.. basically infinite. Like, I'll have 2000 of these chunks of data and I need to run the algorithm on it 2000 times.
The issue I get is:
Parallelizing by copying the 20 MB onto each core and then running it on the 2 MB of data has massive overhead and is incredibly slow.
If I put the algorithm in shared memory, that cleans up the overhead a bit, but the chunks of data aren't perfectly equal in size, so some workers finish earlier than others.
And then I have a join() call where one core is just chugging along on its own. This is exacerbated when I split the algorithm onto its own set of cores - for instance, copy the algorithm to 1 core, run 4 sets of data on it, and do that across all 112 cores.

Basically -- do you have other ideas on what I can try when working with a hefty algorithm + constant stream of data? I have plenty of CPUs, I just don't know how to utilize them properly.
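A minimal sketch of one way to attack both problems (the per-worker copies and the straggler at join()), assuming Linux and the fork start method so the filter bank built in the parent is shared copy-on-write instead of duplicated; build_filter_bank and classify_chunk are made-up stand-ins, not the actual code:

import multiprocessing as mp
import numpy as np

FILTERS = None  # set in the parent before forking; children inherit the pages

def build_filter_bank():
    return np.zeros(20 * 1024 * 1024 // 8)    # placeholder for the ~20 MB of state

def classify_chunk(chunk):
    # only reads FILTERS, so with fork the pages stay shared copy-on-write
    return float(np.dot(FILTERS[:chunk.size], chunk))

def main():
    global FILTERS
    FILTERS = build_filter_bank()              # built once, before any worker exists
    chunks = (np.ones(2 * 1024 * 1024 // 8) for _ in range(2000))
    ctx = mp.get_context("fork")
    with ctx.Pool(processes=112) as pool:
        # imap_unordered hands a chunk to whichever worker frees up first, so
        # uneven chunks don't leave 111 cores idle behind one straggler at join()
        for result in pool.imap_unordered(classify_chunk, chunks):
            pass                               # aggregate results here

if __name__ == "__main__":
    main()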
Anonymous
It's a class that contains a bank of filters
The 20 MB of filters get applied to the data to try to classify it.
So I was copying this class into every one of my processes, but once the processes were spread across all the CPUs, I guess the context switching and copying the data onto each CPU was too much: it went from a 2-minute execution to a 6-hour execution.
When I limit to only 2 CPUs via taskset -c $1 and run it a bunch of times it goes super fast.

Would it be that, because I have, say, 28 processes, it's shoving 20*28 MB into cache?
Sorry, what is MPI?
Well, most likely, no. Ah, no.
I mean I don't wanna say what it is cuz you'll immediately just say "well that's your problem" and end it there lol
But it's Python and honestly I've just been doing "run code & run code1 & run code2 & run code3 &" in bash so it has different processes.
Er:
taskset -c 1,2,3,4 python main.py & taskset -c 5,6,7,8 python main1.py & taskset -c 9,10,11,12 python main2.py & -- etc
When I look at the system-monitor, it seems to properly use those CPUs I've set affinity for
Oh! Speaking of - the thing I'm trying to do is set CPU affinity from inside my program itself. If I spawn a process and immediately restrict its affinity to just 1 CPU, but it started out allowed on, say, all 112 CPUs.. does it immediately copy every instance of my class over to all 112 CPUs? (see the sketch below)
I was worried I'm copying data over when I shouldn't be.
I do have it --- okay.. uh, would you be able to determine if -- ah hold on
>>21797
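For the in-program affinity part, a rough sketch: on Linux you can pin a process from inside Python with os.sched_setaffinity instead of wrapping every launch in taskset. worker_main is a hypothetical name. Affinity is only a scheduling constraint - it never copies your class anywhere; data only lands in a core's cache when that core actually touches it.

import os
import multiprocessing as mp

def worker_main(core_id, chunk_queue):
    os.sched_setaffinity(0, {core_id})    # 0 = the calling process; pin to one core
    while True:
        chunk = chunk_queue.get()
        if chunk is None:                 # sentinel: no more work
            break
        # ... run the classifier on chunk ...

def main():
    ctx = mp.get_context("fork")
    q = ctx.Queue()
    workers = [ctx.Process(target=worker_main, args=(core, q))
               for core in sorted(os.sched_getaffinity(0))]
    for w in workers:
        w.start()
    # ... feed chunks into q, then put one None per worker and join() them ...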
Anonymous
Is there a way for me to track if I am loading 20 MB into each core or if it's simply pulling from shared memory properly?
I'm simply struggling with what I need to look up on the internet for these things.
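One way to check that yourself, as a sketch: on Linux, compare each worker's RSS against its USS (pages unique to that process). This uses the third-party psutil package, so it's only a suggestion, not what the thread actually ran.

import psutil

def report(pids):
    for pid in pids:
        m = psutil.Process(pid).memory_full_info()   # parses /proc/<pid>/smaps
        # rss counts everything mapped in, shared pages included;
        # uss counts only pages private to this process
        print(f"pid={pid} rss={m.rss/2**20:6.1f} MB uss={m.uss/2**20:6.1f} MB")

# usage: report([w.pid for w in workers])
# if uss stays small while rss includes the ~20 MB, the filter bank is shared;
# if uss grows by ~20 MB per worker, every process holds its own private copy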
www ah alright alright. I do have about half of my code written in C, and I call the compiled C from Python, so maybe I can just slap a memory util in there as well.
Mmmm you're smart thanks I gotta do that

For Python <--> C, Python has a ctypes library where you point it at a shared library, declare the argument and return types, and then you can call the shared library's functions from Python.
>>21797
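Roughly what that ctypes flow looks like, as a sketch - libclassify.so and classify_chunk are made-up names for the compiled C side, not the real ones:

# C side, built with something like: gcc -O2 -shared -fPIC -o libclassify.so classify.c
#   int classify_chunk(const double *data, size_t n);
import ctypes
import numpy as np

lib = ctypes.CDLL("./libclassify.so")                  # point at the shared library
lib.classify_chunk.argtypes = [ctypes.POINTER(ctypes.c_double), ctypes.c_size_t]
lib.classify_chunk.restype = ctypes.c_int              # declare in/out types once

chunk = np.zeros(2 * 1024 * 1024 // 8)                 # ~2 MB of doubles
ptr = chunk.ctypes.data_as(ctypes.POINTER(ctypes.c_double))
label = lib.classify_chunk(ptr, chunk.size)            # passes a pointer, no copy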
VIPPER
Well that was retarded, uh, be careful and if I see your posts get got I’ll unban you www
Anonymous
??
Did you get yourself auto-banned? www
>>21803
Anonymous
Oh fuck your message is gone, I refreshed the tab, dammit
>>21803
huh
VIPPER
wwwwww could always bake the filters into the C part's rodata if you want to have FUN with the good old ar(1)
your executable will be fuckhueg but the data gets copied in once at link time and lookups are free
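For anyone who wants to try that at home, this is roughly the objcopy/ar dance being described, in C; filters.bin and the exact flags are assumptions and vary by toolchain:

/* build steps (GNU binutils, x86-64):
 *   objcopy -I binary -O elf64-x86-64 -B i386:x86-64 \
 *           --rename-section .data=.rodata,alloc,load,readonly,data,contents \
 *           filters.bin filters.o
 *   ar rcs libfilters.a filters.o classify.o
 * objcopy derives these symbol names from the input file name: */
#include <stddef.h>

extern const unsigned char _binary_filters_bin_start[];
extern const unsigned char _binary_filters_bin_end[];

/* the blob is mapped straight out of the binary, read-only, so every process
 * that loads it shares the same physical pages */
static const unsigned char *filters = _binary_filters_bin_start;
#define FILTERS_SIZE ((size_t)(_binary_filters_bin_end - _binary_filters_bin_start))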
Anonymous
I am glad I have something to push forward with now though, cuz I was absolutely stumped.
Thanks!
VIPPER
wow
Zero-copy access in Python looks like a PITA
amazing, i love programming
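For what it's worth, the least painful zero-copy route in pure Python is probably multiprocessing.shared_memory plus a numpy view over the buffer - a sketch under that assumption, with made-up sizes:

from multiprocessing import shared_memory
import numpy as np

N = 20 * 1024 * 1024 // 8                          # ~20 MB of float64 filters

# parent: create the block once and fill it with the real filters
shm = shared_memory.SharedMemory(create=True, size=N * 8)
bank = np.ndarray((N,), dtype=np.float64, buffer=shm.buf)
bank[:] = 0.0

# worker: attach by name and build a zero-copy view, nothing gets pickled
def worker(shm_name):
    existing = shared_memory.SharedMemory(name=shm_name)
    view = np.ndarray((N,), dtype=np.float64, buffer=existing.buf)
    # ... apply the filters in view read-only ...
    existing.close()

# parent, once everything is done: shm.close(); shm.unlink()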
VIPPER
oyasumi plopers
VIPPER
Wake the fuck up nugs
VIPPER
end of horse
VIPPER
Nite reads
VIPPER
Mrplopbeast
Nameless
aaaa
aaaaaaaa
Nameless
im having vivid hallucinations or flashbacks or whatever of me having SEX
aaaa
>>21827
VIPPER
>>21826
waow
wow
SEX
imagine, you could be having SEX
RIGHT NOW
if you just got on public transit wwww
VIPPER
some like it plop
VIPPER
Tellolism
VIPPER
#plop (43)
Anonymous
>>21830
the biggest of nerds
los nerdos
Anonymous
nerds, kurds, and turds
VIPPER
Plop tuah
