
Message boards : Multicore CPUs : VM apps

MJH
Project administrator
Project developer
Project scientist
Joined: 12 Nov 07
Posts: 696
Credit: 27,266,655
RAC: 0
Message 33874 - Posted: 14 Nov 2013 | 10:04:19 UTC

Hi all,

I know a few BOINC projects deploy apps in virtual machines. Are any of you crunching for any of those? I would be obliged if you could give me a quick run-down of the common problems, please.

Cheers

Matt

captainjack
Joined: 9 May 13
Posts: 171
Credit: 2,370,904,288
RAC: 2,409,267
Message 33877 - Posted: 14 Nov 2013 | 14:57:30 UTC

I have been crunching on Test4Theory for just over 2 years. It is a project that does work for CERN and the large hadron collider.

The theory is that by using a virtual machine, they are only required to get their application to work on one operating system.

The setup requires a host operating system (Windows, OS X or Linux), VirtualBox, a guest operating system (CERN uses their own flavor of Linux), BOINC, and the scientific application. And that, my friend, is an awful lot of moving parts to try to keep working together.

Any time there is an upgrade in any of the components, there is a chance of compatibility problems. Most of the time, problems occur when Oracle "improves" VirtualBox and the "improvements" break something in the BOINC -> VirtualBox -> guest O/S interface.

Another area of problems is getting permissions set up correctly so that VirtualBox and the guest O/S have the permissions they need to function properly, especially in Linux. That may have improved some with the latest release of the VirtualBox wrapper, but caution is still advised.

If you cruise on over to the Test4Theory web site, you will see that they have a relatively low level of participation. It is probably because it is challenging to set up the BOINC/Virtualbox environment and keep up with all the upgrades and problem resolutions.

You might want to spend some time browsing the message boards at Test4Theory to get a feel for the challenges that they face.

If you want more information from me, let me know.

Hope that helps,
CaptainJack

Dagorath
Joined: 16 Mar 11
Posts: 509
Credit: 179,005,236
RAC: 0
Message 33878 - Posted: 14 Nov 2013 | 15:10:49 UTC - in response to Message 33874.

The Test4Theory project has been one of the projects at the forefront of developing the "app in a VM" approach. Working closely with Rom Walton and Oracle, they've eliminated most of the technical problems though I think they haven't quite finished recompiling and bundling all the fixes together and releasing a new vboxwrapper. Jacob Klein is more familiar with that than I am so I'll leave that topic for him. I would rather describe to you what I think has been the source of past problems and may well be the source of future problems.

In the traditional model BOINC communicates with and controls the project app directly. By "app control" I mean suspending, resuming, stopping and starting the app. When the app is in a VM, the traditional BOINC <-> app comms lines don't exist because the app is sandboxed in the VM. App control is achieved through vboxwrapper, developed mostly by Rom Walton. Vboxwrapper calls VBoxManage (the command-line tool Oracle ships with the VirtualBox package) with appropriate parameters, VBoxManage controls the VM, and that is mostly where past problems have occurred. Bugs in VBoxManage, in the VM itself, or perhaps in Rom's use of VBoxManage have caused sporadic comms "disconnects" between the BOINC client and VBoxManage, which have resulted in, among other things, the VM not stopping/suspending when it should. As I mentioned above, that problem, as well as others, appears to have been eliminated recently. There may be other problems I am not aware of.

Today the main deficiency, though it may never cause a problem that becomes apparent to the user, is that controlling the VM via VBoxManage is like controlling the BOINC client via boinccmd. It works but it's clunky, for lack of a technical term. Oracle has developed an API for the control interface which vboxwrapper could use (but does not) to control the VM directly. Perhaps one day the BOINC client itself will use the API and eliminate the wrapper.
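
As a rough illustration of the control path described above (wrapper -> command-line tool -> VM), here is a minimal Python sketch. It is not how vboxwrapper is actually written; the function names are made up for this example, and the only VirtualBox pieces it relies on are the standard `VBoxManage controlvm <vm> pause|resume|savestate|poweroff` subcommands.

```python
import shutil
import subprocess

# Actions accepted by `VBoxManage controlvm`; these four are standard subcommands.
CONTROL_ACTIONS = {"pause", "resume", "savestate", "poweroff"}


def controlvm_command(vm_name, action):
    """Build the argv for one VBoxManage control call (nothing is executed)."""
    if action not in CONTROL_ACTIONS:
        raise ValueError(f"unsupported action: {action}")
    return ["VBoxManage", "controlvm", vm_name, action]


def control_vm(vm_name, action, timeout=60):
    """Run the command and report success (hypothetical helper).

    Every call spawns a separate process and only an exit code comes back,
    which is the clunkiness described above.
    """
    if shutil.which("VBoxManage") is None:
        return False  # VirtualBox is not installed or not on PATH
    try:
        result = subprocess.run(
            controlvm_command(vm_name, action),
            capture_output=True, text=True, timeout=timeout,
        )
    except subprocess.TimeoutExpired:
        return False  # the tool hung; the VM state is now unknown
    return result.returncode == 0
```

A timeout or non-zero exit code leaves the caller unsure whether the VM really paused, which is roughly the kind of "disconnect" a direct API binding would avoid.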

____________
BOINC <<--- credit whores, pedants, alien hunters

captainjack
Message 33879 - Posted: 14 Nov 2013 | 15:35:12 UTC

Here is another factor to consider, not a problem per se, but just something to think about. When running a BOINC task under a VM, each task that runs requires a virtual machine, a copy of the guest O/S, a copy of the scientific application and all the data that is required to perform all the calculations. That all amounts to quite a bit of memory that is required to run a task.

The Test4Theory task that I am running right now in Linux is using 348MiB of memory.

Test4Theory is currently set up to only run 1 task per computer. If a project/person wants to run more than one task per machine, memory requirements will go up proportionally.

MJH
Message 33881 - Posted: 14 Nov 2013 | 16:30:03 UTC

Thanks guys, all useful info.

On a practical level, do I take it that the user needs to download and install Virtualbox separately, or is there a combined BOINC+VM installer?

skgiven
Volunteer moderator
Volunteer tester
Joined: 23 Apr 09
Posts: 3968
Credit: 1,995,359,260
RAC: 0
Message 33884 - Posted: 14 Nov 2013 | 18:24:43 UTC - in response to Message 33881.
Last modified: 14 Nov 2013 | 18:55:25 UTC

http://boinc.berkeley.edu/download_all.php

Read the "BOINC v7.2.28 has been released!" thread.

So far as I'm aware, there are no GPU projects using a VM.

If you google Boinc, VM and VirtualBox you should find some details, including publications/development work.

- RNA World also have an app; cmsearch VM (VirtualBox) 1.0.2
They use a 782MB virtual disk image (vdi), rnaWorld2GB_x64_2.vdi
and the wrapper, vboxwrapper_26028_windows_x86_64.
RNA World use the VM as it allows them to checkpoint (in a way). Some/most of their tasks take months (~4 months on high-end processors).
Problems include the VM not being removed when a WU fails (would use up hard drive space).

http://www.rechenkraft.net/phpBB/viewtopic.php?f=75&t=13127
____________
FAQ's

HOW TO:
- Opt out of Beta Tests
- Ask for Help

Dagorath
Message 33885 - Posted: 14 Nov 2013 | 20:25:58 UTC - in response to Message 33884.

Problems include the VM not being removed when a WU fails (would use up hard drive space).


That has been a problem at Test4Theory (T4T) as well. It's been a while since I played with VBoxManage, but if memory serves it is possible to query VBoxManage to see if there is any garbage lying about and follow up with commands to clean it up. If devs at projects employing VBox are not doing that, it might be out of concern that they might receive bad intel from the query and delete some other VM running on the host.
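
A hedged sketch of what such a query-then-clean-up pass could look like in Python. It parses the text that `VBoxManage list vms` and `VBoxManage list runningvms` print (one `"name" {uuid}` pair per line) and only builds the `VBoxManage unregistervm <uuid> --delete` command rather than running it. The `boinc_` name prefix and all function names are assumptions for this example, not something taken from any project's wrapper.

```python
import re

# Each line of `VBoxManage list vms` looks like: "vm name" {uuid}
VM_LINE = re.compile(r'^"(?P<name>.*)"\s+\{(?P<uuid>[0-9a-fA-F-]+)\}$')


def parse_vm_list(text):
    """Return (name, uuid) pairs parsed from `VBoxManage list vms` style output."""
    vms = []
    for line in text.splitlines():
        match = VM_LINE.match(line.strip())
        if match:
            vms.append((match.group("name"), match.group("uuid")))
    return vms


def stale_vms(registered, running, prefix="boinc_"):
    """Pick VMs that carry the (assumed) project prefix and are not running."""
    running_ids = {uuid for _, uuid in running}
    return [(name, uuid) for name, uuid in registered
            if name.startswith(prefix) and uuid not in running_ids]


def cleanup_command(uuid):
    """Build (not run) the command that unregisters a VM and deletes its files."""
    return ["VBoxManage", "unregistervm", uuid, "--delete"]
```

Filtering on a name prefix is the guard against deleting some other VM on the host. "Not running" is still a crude test, since a suspended task's VM is also not running, so a real cleanup would need to cross-check against the tasks the client still knows about.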

As far as VBox not virtualizing the GPU... maybe someone needs to ask Oracle if they could add that capability. Maybe it's just a matter of Oracle not perceiving a demand for it? If so then perhaps that perception can be changed?


Jacob Klein
Joined: 11 Oct 08
Posts: 1127
Credit: 1,901,927,545
RAC: 0
Message 33889 - Posted: 15 Nov 2013 | 0:53:18 UTC
Last modified: 15 Nov 2013 | 1:32:12 UTC

I'll chime in.

I've only been doing VM tasks for about 3-4 weeks. I wish I had been doing it earlier, since I'm a BOINC alpha tester (and could have helped find/fix the problems), but alas, for whatever reason, I didn't. But now, I am doing work for Test4Theory, RNA World, and Climate@Home (when they have work).

The main problems I've seen have been:

1) Oracle problems resuming from snapshots. 4.2.16 was great, and didn't have problems, but 4.2.18 introduced a bug, 4.3.0 fixed it but introduced another, and 4.3.2 [current] fixed it but introduced another. The most recent unfixed bug does not affect RNA World, but does affect T4T. Details can be found at these 2 links:
http://lhcathome2.cern.ch/test4theory/forum_thread.php?id=1364&postid=15560
https://www.virtualbox.org/ticket/12291
I expect Oracle to get this resolved in the next maintenance release, so VirtualBox itself doesn't fail BOINC tasks like it does now when resuming T4T tasks.

2) Any BOINC application that uses a VM goes through vboxwrapper. It too has had some bugs that I've worked with Rom to solve. Most recently, we found a heartbeat bug where tasks for applications compiled with API code from the past few months, when run on Windows service installs, would simply fail due to a problem with the heartbeat check. It's fixed now, but it requires a newer vboxwrapper, which means a newer application version. Also, there were some issues with trickle messages not working properly, but I believe that's fixed too. Long story short: vboxwrapper is not quite mature yet, but it is maturing, and I'd recommend that any application that uses it have a team that is "nimble" enough to easily update to newer application versions.

3) Communication/control doesn't always work. It works almost all the time, but during BOINC upgrades and a few other scenarios, the VMs are not paused appropriately. Then when BOINC later tries to "attain a lock" to control it, the lock can fail, and it causes problems. Rom is trying to get better control on this with the wrapper, but it is an ongoing issue that does occur.

4) VirtualBox being installed. This is a problem for some. For me, I encounter network problems during LAN parties, whenever VirtualBox is installed (because it does something with some "VirtualBox Host-Only Network" network adapter that can cause problems when gaming or sharing files.) In general, it works just fine, but it does monkey with the network, and that right there is a reason that I don't install it on my wife's laptop, because I cannot risk her having any sort of crazy network issue when she's away at the office doing work. And so that's a "showstopper" for some machines/users.

5) VM Management. If you are in a non-service install, it's neat that you can open the "VirtualBox Manager" program to see BOINC managing (pausing, snapshotting, resuming, shutting off) your VMs. However, for service installs, you cannot. So, you lose out on seeing that info. Not a big deal, but it is a problem, in the sense that the user can't ever manually see/control/clean up the VMs. Note: T4T actually uses the "Remote Display" capabilities of VirtualBox, and the BOINC client has support for it too, such that in BOINC I can click a "Show VM Console" button, which uses Windows Remote Desktop to actually connect to the VM and show me some text. It's neat. :)

6) Cleanup. Most times, when a task fails, BOINC uses the vboxwrapper to clean the slot up. But sometimes it doesn't do that. (I haven't thoroughly tested) But Rom told me that, the next time that application runs in that slot, it'll do a cleanup before copying data into the slot. I find that a bit sloppy, especially since we're dealing with big files, but it is what it is.

7) Memory. RNA World uses a 64-bit VM with a Base Memory of 4 GB. So, BOINC will make sure your system is allowed to use 4 GB before giving you the task. But while that task is running, Windows doesn't have an effective way to see how much memory the VM is using, and the VirtualBox manual states that the host must set aside the Base Memory amount for the VM and should not expect to use any of it. So we recently put in a client fix (made it into 7.2.28) where, when any VM app is running, the <rsc_memory_bound> value is used as the "this amount of memory is allocated for this running app" value. Non-VM apps still query the Windows process list to get that information. So... if you want to run an RNA World app, BOINC [correctly] considers 4 GB of memory as "used", even if the guest OS is using a fraction of that. Thus, it's crucial to set "Base Memory" to an appropriate value and to make sure <rsc_memory_bound> matches it; for projects that "might use x amount", BOINC will consider "x amount used" while the task is running.
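
To make the accounting rule in point 7 concrete, here is a small Python model of it. This is only a sketch of the behaviour as described in this post, not the BOINC client's code; the dictionary keys and function names are hypothetical, and the assumption that a new task is admitted only if its full bound fits is a simplification.

```python
GIB = 1024 ** 3


def memory_in_use(tasks):
    """Sum the memory the client would count as used by running tasks.

    Each task is a dict with hypothetical keys:
      is_vm            -- True for VirtualBox-based apps
      rsc_memory_bound -- declared bound in bytes
      working_set      -- bytes actually measured by the host OS
    """
    total = 0
    for task in tasks:
        if task["is_vm"]:
            # VM apps: charge the full declared bound, whatever the guest uses.
            total += task["rsc_memory_bound"]
        else:
            # Non-VM apps: charge what the OS process list reports.
            total += task["working_set"]
    return total


def can_start(new_task, running, allowed_bytes):
    """Assumed admission rule: the new task's full bound must still fit."""
    return memory_in_use(running) + new_task["rsc_memory_bound"] <= allowed_bytes
```

With one 4 GB VM task already running, a second 4 GB VM task does not fit on a host that allows BOINC 8 GiB once any other task is using memory, even if both guests are mostly idle. That is why an oversized "Base Memory" costs the volunteer real throughput.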

8) Host responsiveness. I have found that, when running 2 VMs (that use the "Enable VT-x/AMD-V" hardware acceleration option) at the same time, my music sometimes skips, my mouse cursor's acceleration isn't always registered properly, so my cursor skips around when I move it, the UI sometimes refreshes panes/grids slowly, and my machine is difficult to control. We're still looking into this issue, but I'm pretty sure it's related to that hardware acceleration option, and perhaps Intel doesn't handle multiple VMs too well on my 2009 i7 965XE CPU. But it is an issue. So, I make sure to only run 1 VM at a time, by manually suspending tasks, until I go to bed, and then I run multiples.

Whew, I'm beat. I'm sure there's more "problems" out there, but that's all I can think of off the top of my head. I'm still cutting my teeth on this VM stuff, but am pressing T4T admins, RNA World admins, and Oracle devs, to all fix the bugs so it all runs smoother. You'll see my name on a lot of posts, mainly because I care. Eventually, BOINC devs may ONLY offer the "BOINC+VirtualBox" installer; they want users to be capable of running these new VirtualBox tasks. And I get that. But, at the same time, we're still working out the kinks.

By the way, I should mention that, for RNA World, their VirtualBox application is a blessing! I restart my machines often, and so for the longest time (years?), I refused to run their tasks, because they could take 2-60 days, with no checkpointing! But that issue is solved! So, right now, I've got one of their VM tasks that is estimated to take 60 days, I've been working on it for 3 weeks, I'm on "CPU-processing-day 11", and so far it is working well! I've restarted numerous times, and it resumes from snapshot just fine. Here's a little more info on how it works: I believe that, if the user has "save to disk" at a value that is less than 10 minutes, vboxwrapper takes a snapshot every 10 minutes (since a lot of data may need to be saved to store the snapshot, which is a "dynamic difference disk"). So, any time I need to restart my PC, I only lose up to 10 minutes of work. That's AWESOME! That means I can actually finish this task that is estimated to take 8 weeks! (I'm excited).
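
The snapshot rule as described here (and it is only stated as a belief, not confirmed from the wrapper source) boils down to a floor on the interval. A minimal Python sketch, with hypothetical names throughout:

```python
MIN_SNAPSHOT_INTERVAL = 600  # seconds; the 10-minute floor described above


def snapshot_interval(user_interval):
    """Effective seconds between snapshots for a given "save to disk" setting."""
    return max(user_interval, MIN_SNAPSHOT_INTERVAL)


def snapshot_due(now, last_snapshot, user_interval):
    """True once at least one effective interval has passed since the last snapshot."""
    return now - last_snapshot >= snapshot_interval(user_interval)
```

Under this model the worst-case loss on a restart is one effective interval: 10 minutes for any preference below 10 minutes, and the user's own value above that.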

Hope you find this useful!
- Jacob

Dagorath
Message 33891 - Posted: 15 Nov 2013 | 3:50:51 UTC - in response to Message 33889.
Last modified: 15 Nov 2013 | 3:55:58 UTC

Jacob Klein wrote:
8) Host responsiveness. I have found that, when running 2 VMs (that use the "Enable VT-x/AMD-V" hardware acceleration option) at the same time, my music sometimes skips, my mouse cursor's acceleration isn't always registered properly, so my cursor skips around when I move it, the UI sometimes refreshes panes/grids slowly, and my machine is difficult to control. We're still looking into this issue, but I'm pretty sure it's related to that hardware acceleration option, and perhaps Intel doesn't handle multiple VMs too well on my 2009 i7 965XE CPU. But it is an issue. So, I make sure to only run 1 VM at a time, by manually suspending tasks, until I go to bed, and then I run multiples.


This is not a shameless Linux plug. My intention is more to applaud VirtualBox than to praise Linux. If Windows were as configurable as Linux, I believe one could get results on Windows just as impressive as those I am about to describe.

On my i7-2600K, no OC, with VT-x enabled, I was able to run simultaneously:

- the 1 VM associated with the T4T BOINC task
- plus 6 other VMs I created, each with BOINC installed and running T4T which of course means a T4T VM in each of those VMs (nested VMs)

That's a total of 13 VMs running simultaneously, all Linux. That was with the old cernwrapper which was as bare bones as it could be, less than 200 lines of C, IIRC. It didn't do any snapshots, which reduces load on VBoxManage somewhat... but... the system ran almost flawlessly, with very little cursor jerking and slow acceleration. VMs started, stopped, paused and resumed flawlessly; only 1 in about 30 VMs had to be cleaned up manually. I could interrupt any of the VMs at any time and it would resume no problem. It survived power outages with no problem. Then I turned off all the eye candy in the desktop and switched the OS installed in the first tier VMs from 64-bit to 32-bit and that restored responsiveness to normal. Oh, I adjusted some process priorities (nice levels) too and that actually helped a lot. After I had it all tuned she was a beast of rare beauty! I was thoroughly amazed and very impressed with VirtualBox.

Rom's vboxwrapper changed all that for me. As the size and complexity of the wrapper grew, the performance of my i7 system with 13 VMs running deteriorated considerably and I had to cut back to just the 1 VM running on the real machine. No ill will toward Rom, as I was always amazed that I could run even 3 VMs simultaneously, let alone 6 nested VMs plus a normal one. I always considered it flying too close to the sun and knew I would plummet back to Earth one day.

On the positive side, and this is a big positive, Rom's wrapper fixed a lot of problems for Windows users, problems that made it impossible for quite a few of them to run T4T.

Jacob Klein
Message 33892 - Posted: 15 Nov 2013 | 4:00:57 UTC - in response to Message 33891.
Last modified: 15 Nov 2013 | 4:05:53 UTC

Just out of curiosity, when you were running multiple VMs, are you ABSOLUTELY POSITIVE that multiple running VMs had the "Enable VT-x/AMD-V" option checked? The reason I ask is that, right now, T4T uses a version of the wrapper that leaves that option unchecked. (T4T uses a single-CPU 32-bit VM, and the old version of the wrapper left the option unchecked in that scenario; on a side note, it suffers from the service-install heartbeat problem as well. T4T admins are looking to upgrade to the latest wrapper in beta sometime soon.)

... and I've only noticed the OS responsiveness issues when multiple tasks with "Enable VT-x/AMD-V" were running. (For T4T, I'm currently using anonymous platform with the newest version of the wrapper, for testing purposes, which checks/enables that option even for single-CPU 32-bit tasks, which is why I'm having the issue currently.)

So, again... are you sure your T4T tasks were using it?

MJH
Message 33893 - Posted: 15 Nov 2013 | 9:11:53 UTC - in response to Message 33889.

Jacob,

Thanks very much for taking the time to write such a comprehensive summary. It's very useful!
One question - am I right to understand that the VM image is supplied in the same way that the application is for a standard BOINC project? That's to say that it is downloaded once and re-used and shared for all WUs?

Matt

Dagorath
Message 33894 - Posted: 15 Nov 2013 | 10:18:57 UTC - in response to Message 33892.

So, again... are you sure your T4T tasks were using it?


Yes, they were. That was over a year and several VBox updates ago so keep that in mind.

Dagorath
Message 33895 - Posted: 15 Nov 2013 | 10:35:27 UTC - in response to Message 33893.

am I right to understand that the VM image is supplied in the same way that the application is for a standard BOINC project? That's to say that it is downloaded once and re-used and shared for all WUs?


Yes.


Jacob Klein
Message 33896 - Posted: 15 Nov 2013 | 14:09:40 UTC - in response to Message 33893.
Last modified: 15 Nov 2013 | 14:10:10 UTC

Jacob,

Thanks very much for taking the time to write such a comprehensive summary. It's very useful!
One question - am I right to understand that the VM image is supplied in the same way that the application is for a standard BOINC project? That's to say that it is downloaded once and re-used and shared for all WUs?

Matt


You're welcome.
Regarding the VM image, yes, I believe that is the case; it is only downloaded once.
My turn to ask a question: Why are you gathering this information - do you have any sort of plans for a VM app or maybe a new VM project?

skgiven
Message 33899 - Posted: 15 Nov 2013 | 21:01:34 UTC - in response to Message 33896.

Matt said he wanted to make a VM image weeks ago...

Jacob Klein
Message 33900 - Posted: 15 Nov 2013 | 21:03:07 UTC - in response to Message 33899.
Last modified: 15 Nov 2013 | 21:20:49 UTC

Ah, thanks, I must have missed that.

ExtraTerrestrial Apes
Volunteer moderator
Volunteer tester
Joined: 17 Aug 08
Posts: 2705
Credit: 1,311,122,549
RAC: 0
Message 33911 - Posted: 16 Nov 2013 | 17:55:50 UTC

I assume this is meant for a possible multi-core app. But: if the GPU could be used under VirtualBox without a performance hit, this could provide a much more stable GPU-Grid environment! You could reduce the software complexity to just one driver under one OS, which could help a lot. Or, if running virtualized caused a performance hit (which I expect to happen, if it is possible at all), this could be used as an optional "reference platform", which people could use when in doubt about their system and software, or when they can't get it to work otherwise.

MrS
____________
Scanning for our furry friends since Jan 2002

skgiven
Message 33912 - Posted: 16 Nov 2013 | 20:00:56 UTC - in response to Message 33911.
Last modified: 16 Nov 2013 | 20:20:17 UTC

A lot of projects don't behave themselves when it comes to task settings (priority, GFlops estimate, estimated runtime, deadline...). So when work arrives for projects like Physics, GPU work stops and Physics tasks run on the CPU. I find this very annoying to say the least. This is one reason I use Linux virtual images to run CPU work, but I install Boinc within the VM.

Last time I tried to get the drivers to work for my GPUs I forgot I didn't have the monitor plugged in, and ended up at a black screen with an X for an arrow. It might be easier to go headless, but I might have to build an image outside of the VM and then import it, somehow. Next time I try the drivers I'll remember to clone first and plug in a monitor!

Dagorath
Message 33914 - Posted: 17 Nov 2013 | 1:50:24 UTC - in response to Message 33912.
Last modified: 17 Nov 2013 | 1:50:47 UTC

Last time I tried to get the drivers to work for my GPU's I forgot I didn't have the monitor plugged in, and ended up at a black screen with an X for an arrow. It might be easier to go headless but I might have to build an image outside of the VM and then import it, some how. Next time I try the drivers I'll remember to clone first and plug in a monitor!


I've never tried headless but have heard others discussing it. For headless I believe you need to install a virtual monitor to fool X into thinking a monitor is attached, otherwise X won't boot. I believe X initializes the driver so if X doesn't boot, BOINC client won't see the GPU.

I still can't get BOINC to see the GPU in virtual Ubuntu. The driver installer definitely checks to see if a supported GPU is installed but even if there is no GPU it allows you to install the driver anyway. Even after an OS reboot, BOINC client doesn't see the GPU. Now I really am out of ideas so it's time to Google and ask for help in nvidia and vbox forums.

MJH
Message 33918 - Posted: 17 Nov 2013 | 10:38:57 UTC - in response to Message 33914.

You can force X to start even in the absence of an attached monitor by adding:

Option "ConnectedMonitor" "crt"

to the Device section of the xorg.conf file.

But you don't even need to be running X. You can simply load the nvidia module with "sudo modprobe nvidia", then run the BOINC command line client.
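
If BOINC still doesn't see the GPU after the modprobe, it can help to confirm the module really loaded. A small Python sketch, assuming a Linux host where `/proc/modules` lists one loaded module per line with the module name in the first column; the function names are made up for this example.

```python
def loaded_modules(proc_modules_text):
    """Return the set of module names from /proc/modules-style text."""
    return {line.split()[0]
            for line in proc_modules_text.splitlines() if line.strip()}


def nvidia_loaded(path="/proc/modules"):
    """True if a module named exactly 'nvidia' is currently loaded."""
    try:
        with open(path) as handle:
            return "nvidia" in loaded_modules(handle.read())
    except OSError:
        return False  # no /proc/modules here (not Linux, or unreadable)
```

This only shows that the kernel module is present; it says nothing about whether the card itself was detected by the driver.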

Matt

MJH
Message 33919 - Posted: 17 Nov 2013 | 10:41:29 UTC - in response to Message 33896.


My turn to ask a question: Why are you gathering this information - do you have any sort of plans for a VM app or maybe a new VM project?



Following up from the plans for a multicore app, I'm pondering deploying it in a VM to avoid the cost of porting to Windows.

BTW, passthrough of GPUs to the VM isn't possible with VirtualBox (the only supported configuration is QEMU and a Tesla/Quadro)

Matt

Jacob Klein
Message 33920 - Posted: 17 Nov 2013 | 13:51:31 UTC
Last modified: 17 Nov 2013 | 13:54:03 UTC

I wanted to mention one more thing that occurred to me, regarding VMs. There are sort of 2 different ways of handling the downloading/uploading of the work.

Method 1:
The way it was intended by the BOINC devs (used by RNA World): BOINC downloads the VM and the work, and the VM is fired up to run that 1 task. A "shared folder" is set up within the slot directory, where BOINC deposits input files (used by the VM), and later grabs output files (produced by the VM). When the task is done, the VM is shut down and the result is uploaded by BOINC.

Method 2:
The way where downloading/uploading of work is managed within the VM (used by Test4Theory): BOINC downloads the VM, but the VM does some network connections to get 1-or-more tasks, or even get new tasks in serial after the prior task completes, and the VM runs indefinitely or it runs until it has been configured to shutdown.

Test4Theory uses method 2, where they run a VM for 24 hours, and in that time frame, the VM uses networking to get a task, run it, report it, get another, etc., several times within that 24 hours.

I've personally had some problems with the way Test4Theory does this. There are times when their servers (called "copilot"?) aren't working, and then an entire CPU is completely wasted, because BOINC "allocated/gave" that task the CPU, and the VM doesn't utilize it. That was a bitter pill for me to swallow, since I desperately desire to keep all of my resources fully-utilized.

Method 1 is preferred, so far as I know.
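
For anyone who finds Method 1 easier to picture as code, here is a minimal Python sketch of the host side of the shared-folder handoff. It is an illustration only: the `shared` subdirectory name, the `*.out` pattern and both function names are assumptions for this example, and the step where the VM actually runs is left out.

```python
import shutil
import tempfile
from pathlib import Path


def stage_inputs(slot_dir, input_files):
    """Copy a task's input files into the slot's shared folder before VM start."""
    shared = Path(slot_dir) / "shared"  # assumed folder name
    shared.mkdir(parents=True, exist_ok=True)
    for src in input_files:
        shutil.copy(src, shared / Path(src).name)
    return shared


def collect_outputs(shared, pattern="*.out"):
    """List the result files the guest left in the shared folder."""
    return sorted(p.name for p in Path(shared).glob(pattern))


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        work = Path(tmp) / "input.dat"
        work.write_text("work unit data")
        shared = stage_inputs(Path(tmp) / "slots" / "0", [work])
        # ... the VM would run here and write its results into `shared` ...
        (shared / "result.out").write_text("done")
        print(collect_outputs(shared))
```

The point of the design is that all network traffic stays with the BOINC client, so a project server outage delays new work but never leaves a running VM holding a CPU with nothing to do.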

Regards,
Jacob

skgiven
Message 33921 - Posted: 17 Nov 2013 | 15:05:55 UTC - in response to Message 33919.

I suggest you release a native Linux app first, and test it in the wild.
Then build a VM to facilitate Windows-only crunchers, if you think you need to.
Then, and only if the VM doesn't hold up, port to Windows.

Anyone using Windows can download VirtualBox, install Ubuntu as a VM, install Boinc and attach to whatever CPU projects they like, without any need for projects to build VM's. Other benefits include better resource control for the user (cruncher), a simplified system (with potential for isolation of GPU and CPU projects), no project VM download overhead, no project work involved and no waiting for further development.

Is that so much harder for a GPU cruncher (who can install cards, drivers, tools and configure Boinc) than downloading the Boinc client along with VirtualBox?

There have been Linux images with Boinc already installed for years, not that installing a VM or Boinc is in any way difficult.

The big issue with Boinc+VirtualBox on Windows is that each Boinc project has to use a separate VM to run apps. That means big overheads, more potential bugs and problems, and less isolation from Windows. In a year's time people could be attached to 5 or 10 projects using VMs. They would need plenty of RAM to run anywhere near capacity and a plump disk drive. While the VM development has facilitated the likes of RNA World, it brings more problems and generally increases the complexity of Boinc crunching. I don't think every project needs to add to that with a bespoke VM.

When the app is out and tested in a native Linux environment vs a manual VM, you will then know what the performance difference is and whether or not it's worth the trouble building and testing for Windows environments (XP, Vista, W7, W8, servers). BTW some apps such as SLinCA@Home don't run well in VM's (reasonable for one task, bad for two, terrible for many).

The early users of VM's do so for good reasons (to enable a sort of checkpoint in the case of RNA World), and setup their VM's accordingly.


PS. I thought that VT-d capable processors might allow VM I/O to the GPU, but they have been around since 2008, and nothing yet. It makes sense that there is a GPU dependency too (GPUs also have to facilitate this, and they don't unless it's a Tesla/Quadro). Presumably it's been removed from the other GK110s (Titan, 780, 780Ti).
____________
FAQ's

HOW TO:
- Opt out of Beta Tests
- Ask for Help

Dagorath
Joined: 16 Mar 11
Posts: 509
Credit: 179,005,236
RAC: 0
Level
Ile
Message 33922 - Posted: 17 Nov 2013 | 16:43:01 UTC - in response to Message 33921.

Your concerns regarding numerous VMs, one from each of several projects, are valid. If that trend continues it won't turn out well for any project. A better approach is one (maybe two) VMs that can run any and all projects that need to run in a VM.

Regarding Tesla and Quadro working in a VM, try GeForce GTX780 to Tesla K20. There are links for GK104 too, gonna try it on my 660Ti soonest. I'll be buying as many 780s as I can and keeping copies of recent drivers because nVidia will put a stop to this somehow.

____________
BOINC <<--- credit whores, pedants, alien hunters

Profile skgiven
Volunteer moderator
Volunteer tester
Joined: 23 Apr 09
Posts: 3968
Credit: 1,995,359,260
RAC: 0
Level
His
Message 33925 - Posted: 17 Nov 2013 | 20:58:05 UTC - in response to Message 33922.
Last modified: 17 Nov 2013 | 21:09:52 UTC

Regarding the potential use of GPUs within a VM, here is a quote from http://askubuntu.com/questions/202926/how-to-use-nvidia-geforce-m310-on-ubuntu-12-10-running-as-guest-in-virtualbox

    PCI passthrough

    PCI passthrough is experimentally supported in recent Virtual Box closed source (PUEL) versions. However there are several limitations, i.e. for a graphics card we can read from the Virtual Box User Manual:

    AGP and certain PCI Express cards are not supported at the moment if they rely on GART

    I can't tell if this is the case with the Nvidia M310, you will have to figure this out first before you try.

    To get PCI passthrough working we also need a motherboard with an enabled IOMMU from BIOS settings (i.e. VT-d for Intel, AMD-Vi for AMD).

    There are several additional prerequisites to be met (see Virtual Box Manual for details):

    Your motherboard has an IOMMU unit.
    Your CPU supports the IOMMU.
    The IOMMU is enabled in the BIOS.
    The VM must run with VT-x/AMD-V and nested paging enabled.
    Your Linux kernel was compiled with IOMMU support, DMA remapping, and the PCI stub driver.
    Your Linux kernel recognizes and uses the IOMMU unit.

    We can then attach a PCI device with its bus:device.function properties read from lspci to our virtual machine using:

    VBoxManage modifyvm "VM name" --pciattach <host-bus>:<host-device>.<host-function>@<guest-bus>:<guest-device>.<guest-function>

    See in the guest with lspci if the device was attached properly before installing drivers for this device.

    Please also refer to the Virtual Box Manual for further limitations.
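
The attach step quoted above can be sketched as follows. This is only an illustration: it assumes plain lspci output where each line starts with a two-digit hex bus:device.function address (no PCI domain prefix), the sample lspci text and the guest address are made up, and the helper names are my own. It builds the VBoxManage argument list but does not run it.

```python
import re

# Matches the "bus:device.function" address lspci prints, e.g. "01:00.0".
BDF_RE = re.compile(r"^[0-9a-fA-F]{2}:[0-9a-fA-F]{2}\.[0-7]$")


def parse_bdf(text):
    """Validate a bus:device.function string and return it normalised."""
    addr = text.strip().lower()
    if not BDF_RE.match(addr):
        raise ValueError(f"not a bus:device.function address: {text!r}")
    return addr


def find_devices(lspci_output, keyword):
    """Return addresses of lspci lines whose description contains keyword."""
    found = []
    for line in lspci_output.splitlines():
        addr, _, desc = line.partition(" ")
        if BDF_RE.match(addr) and keyword.lower() in desc.lower():
            found.append(addr.lower())
    return found


def pciattach_command(vm_name, host_bdf, guest_bdf):
    """Build (but do not run) the VBoxManage call from the quoted text."""
    target = f"{parse_bdf(host_bdf)}@{parse_bdf(guest_bdf)}"
    return ["VBoxManage", "modifyvm", vm_name, "--pciattach", target]


if __name__ == "__main__":
    # Hypothetical lspci output; run lspci on the host to get the real one.
    sample = (
        "00:02.0 VGA compatible controller: Intel Corporation Device\n"
        "01:00.0 VGA compatible controller: NVIDIA Corporation Device\n"
    )
    for host_addr in find_devices(sample, "nvidia"):
        # The guest slot 01:05.0 is an arbitrary example value.
        print(" ".join(pciattach_command("VM name", host_addr, "01:05.0")))
```

Returning the command as a list rather than running it lets you check the address before anything touches the VM. All the prerequisites listed in the quote still apply.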



My CPU is a K model (not VT-d), a show stopper for me!

PS. I think the existing VM CPU projects also require VT-x/AMD-V. While some motherboards have this automatically enabled, you have to set it in others. To an amateur cruncher, configuring the BIOS is probably just as hard as installing Boinc in a VM. While most modern CPUs have either VT-x or AMD-V, some Intel processors don't.
For reference.
List of Intel processors with VT-X
List of Intel processors with VT-d
List of Intel processors without VT-d
- Might be useful for projects to test for VT-x/AMD-V before downloading an 800MB VM!
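
A minimal sketch of such a pre-download test, for Linux hosts only. It looks for the vmx (Intel VT-x) or svm (AMD-V) flag in /proc/cpuinfo; the function names are my own. Note the flag only tells you what the CPU is capable of, so the feature may still be switched off in the BIOS, and Windows hosts would need a different check.

```python
VIRT_FLAGS = {"vmx", "svm"}  # vmx = Intel VT-x, svm = AMD-V


def virtualization_flags(cpuinfo_text):
    """Return the hardware-virtualization flags found in cpuinfo text."""
    found = set()
    for line in cpuinfo_text.splitlines():
        key, _, value = line.partition(":")
        if key.strip() == "flags":
            # Compare whole tokens so that e.g. "vme" does not match "vmx".
            found |= set(value.split()) & VIRT_FLAGS
    return found


def cpu_supports_vm(cpuinfo_text):
    """True if the CPU advertises VT-x or AMD-V."""
    return bool(virtualization_flags(cpuinfo_text))


if __name__ == "__main__":
    try:
        with open("/proc/cpuinfo") as f:
            text = f.read()
    except OSError:
        text = ""  # not Linux, or /proc not readable
    if cpu_supports_vm(text):
        print("CPU advertises", ", ".join(sorted(virtualization_flags(text))))
    else:
        print("No VT-x/AMD-V flag found; skip the VM download.")
```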

I had read about the GK104 hacks before but not the GK110 hacks.

Jacob Klein
Joined: 11 Oct 08
Posts: 1127
Credit: 1,901,927,545
RAC: 0
Level
His
Message 33941 - Posted: 20 Nov 2013 | 2:57:23 UTC
Last modified: 20 Nov 2013 | 3:05:36 UTC

I have solved my "Host responsiveness" issue that I reported as #8 in my list.
Basically, Oracle VirtualBox has had a bug since v4.2, and the bug still affects the current release, v4.3.2. Read on for a solution.

The bug manifests itself as huge DPC Latencies in the host environment, possibly only on x64 Windows hosts. For me, this meant that the mouse cursor acceleration was misinterpreted, my mouse cursor skipped around irregularly, my music sometimes skipped, and my windows/controls sometimes lagged when being painted/refreshed/moved. My workaround had been to only allow 1 VM to run at a time, since the problem dissipated with just 1 VM running (problem still existed, just not as bad). Keep reading for that solution.

The bug is documented a bit in this thread:
https://forums.virtualbox.org/viewtopic.php?f=6&t=58030&sid=029eb906ea896f79a218b8d33ed399cf

The VirtualBox devs believe they've fixed the problem, and they gave a link to a 4.3.3 Test version at the top of page 2 in that thread.

That 4.3.3 Test version does indeed resolve my host responsiveness issue. I can now run 3 VMs, plus 4 other non-VM CPU tasks, plus 3 GPU tasks, while listening to music, while playing a game.... all without a single problem. Life is good again.

I expect the 4.3.3 official release (no ETA yet) will include this bug fix.

Regards,
Jacob
