Yet another cool WebGL demo

Admittedly, I know diddly squat about ultra-modern PC architecture. But wouldn't it be trivial to grab the main CPU if you have complete control over the programmable GPU? Couldn't you just use the GPU's DMA to modify whatever system RAM you want? I don't think the OS would know how to defend against a hostile PCIe card, and I don't know of anything that would ensure the GPU respects the processor's mapping of data/executable regions.

I don't think they can arbitrarily access memory in quite that fashion, only memory that has been marked as mappable by the OS in the first place (there's a rough sketch of what I mean below the bandwidth numbers). The trend is for modern graphics cards to have 1GB or more of their own memory, since from their perspective, accessing system RAM over PCIe is cripplingly slow.
Code:
:~/NVIDIA_GPU_Computing_SDK/C/bin/linux/release$ ./bandwidthTest
[bandwidthTest] starting...
./bandwidthTest Starting...Running on...

Device 0: GeForce GTX 275
Quick Mode

Host to Device Bandwidth, 1 Device(s), Paged memory
  Transfer Size (Bytes)    Bandwidth(MB/s)
  33554432            2849.3

Device to Host Bandwidth, 1 Device(s), Paged memory
  Transfer Size (Bytes)    Bandwidth(MB/s)
  33554432            2225.0

Device to Device Bandwidth, 1 Device(s)
  Transfer Size (Bytes)    Bandwidth(MB/s)
  33554432            105164.6

[bandwidthTest] test results...
PASSED

As you can see, host <-> device performance is way slower.
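
For what it's worth, here's a rough sketch of what I mean by "marked as mappable", using the CUDA runtime API on the host side (error checking left out, and obviously just a sketch, not how any particular driver works internally). The device only gets a pointer into host RAM for pages the driver has explicitly pinned and mapped; it isn't handed the rest of system memory:

Code:
#include <cstdio>
#include <cuda_runtime.h>

int main()
{
    // The device can only be handed host memory that the driver has
    // explicitly pinned and mapped, and only if the hardware supports it.
    cudaDeviceProp prop;
    cudaGetDeviceProperties(&prop, 0);
    if (!prop.canMapHostMemory) {
        fprintf(stderr, "device cannot map host memory\n");
        return 1;
    }
    cudaSetDeviceFlags(cudaDeviceMapHost);

    // Ask the driver to pin and map 1MB of host RAM...
    void *hostPtr = NULL;
    cudaHostAlloc(&hostPtr, 1 << 20, cudaHostAllocMapped);

    // ...and fetch the device-side alias for that one allocation. The GPU
    // sees this mapping and nothing else of system memory.
    void *devPtr = NULL;
    cudaHostGetDevicePointer(&devPtr, hostPtr, 0);

    printf("host %p is visible to the device as %p\n", hostPtr, devPtr);

    cudaFreeHost(hostPtr);
    return 0;
}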
 
I don't think they can arbitrarily access memory in quite that fashion, only memory that has been marked as mappable by the OS in the first place.

This is the part I don't know... That may well be true, which would limit the usefulness of this attack. I've never tried coding anything for a modern GPU. I have no idea what areas of RAM it can and can't access, or which APIs it does that through.

The trend is for modern graphics cards to have 1GB or more of their own memory, since from their perspective, accessing system RAM over PCIe is cripplingly slow.

As you can see, host <-> device performance is way slower.

Yes, modern cards definitely have their own built-in RAM, and leaving that space incurs a huge performance penalty. But you only need a few bytes to pwn the main CPU. Performance is pretty much irrelevant: does it really matter if it takes the PCIe card an extra 2 nanoseconds to modify the RAM and take over the CPU? It's not as if you'd have enough time to do something about it either way.
 
I dunno how achievable/practical that really is. Take over the CPU how, exactly?

A more likely attack vector is to try and get the device to simply overwhelm the CPU with interrupt requests or something of that nature. I managed to lock up my X Server pretty badly by trying to spawn 1,000,000 threads on my then GPU (which could handle 24K of them without switching) with an older version of CUDA (probably 1.3). Killing X from a remote session sorted that out.

CUDA, however, is much lower level than GLSL. In the latter, you are further constrained by the limits imposed by the GL shader model runtime. I think that what really concerns the security guys is that the sandbox limits of GLSL are not necessarily well-defined and that such code may be invoked while the CPU is running within kernel space (depending on the driver architecture). If a bad GLSL program manages to tie up the CPU via some sideline attack method, then it might be possible to lock up the machine. To be genuinely useful as an attack method, however, you'd be looking for a way to deploy code to the host CPU.

Lastly, the ability for programmable devices to alter system memory is not new to GPUs. SCSI script processors on PCI cards have been doing that for years.
 
Take over the CPU how, exactly?

Same idea as a buffer overflow. Inject your piece of code over something in RAM that you know the CPU is going to execute. Only instead of overflowing a buffer to get it there, you're using the GPU to modify the system RAM.

CUDA, however, is much lower level than GLSL. In the latter, you are further constrained by the limits imposed by the GL shader model runtime. I think that what really concerns the security guys is that the sandbox limits of GLSL are not necessarily well-defined and that such code may be invoked while the CPU is running within kernel space (depending on the driver architecture). If a bad GLSL program manages to tie up the CPU via some sideline attack method, then it might be possible to lock up the machine. To be genuinely useful as an attack method, however, you'd be looking for a way to deploy code to the host CPU.

Yah. I guess the question is, does GLSL include something you could trick into writing a certain piece of "executable data" (a fake texture?) from the device into a specific area of system RAM? If so, I'd think that would be a big problem. If not, probably less so, as then there's a whole extra layer of defense to deal with.

Lastly, the ability for programmable devices to alter system memory is not new to GPUs. SCSI script processors on PCI cards have been doing that for years.

Very true. But as far as I know, the security for those devices has come from them never running untrusted code in the first place. There was no feature that passed outside code to a SCSI processor the way GLSL passes outside code to a GPU. If you're passing code to the SCSI processor, you already own at least the SCSI driver, at which point worrying about anything else is rather pointless...
 
There was no feature that passed outside code to a SCSI processor the way GLSL passes outside code to a GPU.

Except that it doesn't quite do that. GLSL is a C-style language that is compiled at runtime, typically into a device-independent bytecode representation. The latter is then translated by the driver implementation into something specifically suited to the end device. Even CUDA, which is entirely an nVidia technology, compiles your high-level code down to a PTX "assembler" language, itself an intermediate stage before being compiled into a device-specific kernel. Implementations have to work this way in order to cope with the huge variation in the underlying hardware. Bad shader code will typically fail to even pass the compilation stage, resulting in a lot of stuff written to stderr from your GL libraries.

You are typically quite limited in terms of resource management within your GLSL (or even CUDA, for that matter). The CPU does all the necessary setup, memory allocation and so on, and it won't ever allocate memory for textures and the like as "executable": all GPU resources are pure data as far as it is concerned.
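
To illustrate the point about the CPU doing all the setup, here's a minimal host-side sketch (plain GL calls; the helper name is made up). The CPU sizes and fills the texture through the GL API, and nothing in these calls has any notion of an "executable" allocation:

Code:
#include <GL/gl.h>
#include <vector>

// Hypothetical helper: the CPU allocates and fills a texture entirely
// through the GL API. The driver decides where the bytes end up (VRAM or
// mapped system RAM); the shader only ever samples them as data.
GLuint makeDataTexture(int width, int height)
{
    std::vector<unsigned char> pixels(width * height * 4, 0);  // plain bytes

    GLuint tex = 0;
    glGenTextures(1, &tex);
    glBindTexture(GL_TEXTURE_2D, tex);
    glTexParameteri(GL_TEXTURE_2D, GL_TEXTURE_MIN_FILTER, GL_NEAREST);
    glTexParameteri(GL_TEXTURE_2D, GL_TEXTURE_MAG_FILTER, GL_NEAREST);

    // The CPU hands over a block of bytes; nothing in the API marks any
    // GL resource as executable.
    glTexImage2D(GL_TEXTURE_2D, 0, GL_RGBA8, width, height, 0,
                 GL_RGBA, GL_UNSIGNED_BYTE, pixels.data());

    return tex;
}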

You would have to hope that your buffer overrun would wander off the end of your data-only memory allocation into some executable space and alter it, before somehow getting the CPU to go there. And that assumes you could even accomplish the overrun in the first place: anything you are writing to from within the GPU environment should be in VRAM anyway, and the controller for paging data in and out of mapped system RAM would likely be part of the interface logic, nothing to do with the GPU core.

I guess anything is possible in the absence of strict rules on how systems work, but I shouldn't think there was a trivial exploit unless the drivers supporting the GLSL were particularly poor.
 
Except that it doesn't quite do that. GLSL is a C-style language that is compiled at runtime, typically into a device-independent bytecode representation. The latter is then translated by the driver implementation into something specifically suited to the end device.

Ah. I was under the assumption that GLSL was a fairly low-level language. I wasn't exactly expecting that many abstraction layers before you could address a modern GPU. Yeah, that doesn't sound very easy to pull off, other than perhaps by accident. :D
 
Ah. I was under the assumption that GLSL was a fairly low-level language. I wasn't exactly expecting that many abstraction layers before you could address a modern GPU. Yeah, that doesn't sound very easy to pull off, other than perhaps by accident. :D
Nope, you literally pass your GLSL as source code in a string to the OpenGL system (there's a rough sketch of the host side after the shader below). Here's a fragment shader I often use in point simulations that turns each point into a simple shaded ball:

Code:
void main()
{
    // up, right and behind the viewer...
    const vec3 lightDir = vec3(0.577, 0.577, 0.577);

    // calculate normal from texture coordinates
    vec3 nrm;
    nrm.xy = gl_TexCoord[0].xy*vec2(2.0, -2.0) + vec2(-1.0, 1.0);
    float mag = dot(nrm.xy, nrm.xy);
    if (mag > 1.0) {
        discard;  // skip pixels outside circle
    }
    nrm.z = sqrt(1.0-mag);

    // calculate lighting
    float diffuse = max(0.0, dot(lightDir, nrm));

    gl_FragColor = gl_Color * diffuse;
}
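
And the host side is just as mundane. Roughly something like this (a sketch using the GL 2.0 entry points via GLEW; real code would handle errors more carefully, and the helper name is made up). The shader really is handed to the driver as a plain C string and compiled on the spot, with any complaints coming back through the info log:

Code:
#include <cstdio>
#include <GL/glew.h>

// Hypothetical helper; assumes a GL context exists and glewInit() has run.
// The GLSL source is just a C string handed to the driver, which compiles
// it at runtime.
GLuint compileFragmentShader(const char *source)
{
    GLuint shader = glCreateShader(GL_FRAGMENT_SHADER);
    glShaderSource(shader, 1, &source, NULL);   // pass the source string
    glCompileShader(shader);                    // driver compiles it now

    GLint ok = GL_FALSE;
    glGetShaderiv(shader, GL_COMPILE_STATUS, &ok);
    if (!ok) {
        // Bad shader code fails here; the driver's complaints come back
        // through the info log, which most apps just dump to stderr.
        char log[1024];
        glGetShaderInfoLog(shader, sizeof(log), NULL, log);
        fprintf(stderr, "shader compile failed:\n%s\n", log);
        glDeleteShader(shader);
        return 0;
    }
    return shader;
}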
 
Well now I know who to talk to when I have an OpenGL question.
 
Well now I know who to talk to when I have an OpenGL question.
Let me know when you find him, because my GLSL skills are frankly pretty poor. I use very basic vertex and fragment shaders to make my CUDA simulations easier to look at (turning a point into a ball, for example, instantly gives you an appreciation of the depth of a point).

The guys doing the WebGL demos are actually using GLSL itself to perform the simulation, representing datasets in textures and so on.
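
Roughly speaking, the host side for that trick looks something like this (a sketch assuming float textures and framebuffer objects via GLEW; the helper name is made up). Positions live one per texel in a float texture, and each simulation step renders updated values into a second texture, ping-ponging between the two:

Code:
#include <GL/glew.h>

// Hypothetical helper; assumes a GL context, glewInit(), and support for
// float textures and framebuffer objects. Particle positions are stored
// one per texel, and each step renders updated values into the other
// texture (ping-pong).
void makeSimulationTargets(int numParticles, GLuint tex[2], GLuint fbo[2])
{
    int side = 1;
    while (side * side < numParticles)
        side *= 2;  // smallest power-of-two square that holds every particle

    glGenTextures(2, tex);
    glGenFramebuffers(2, fbo);
    for (int i = 0; i < 2; ++i) {
        glBindTexture(GL_TEXTURE_2D, tex[i]);
        glTexParameteri(GL_TEXTURE_2D, GL_TEXTURE_MIN_FILTER, GL_NEAREST);
        glTexParameteri(GL_TEXTURE_2D, GL_TEXTURE_MAG_FILTER, GL_NEAREST);
        // One RGBA32F texel per particle: xyz = position, w = spare.
        glTexImage2D(GL_TEXTURE_2D, 0, GL_RGBA32F, side, side, 0,
                     GL_RGBA, GL_FLOAT, NULL);

        glBindFramebuffer(GL_FRAMEBUFFER, fbo[i]);
        glFramebufferTexture2D(GL_FRAMEBUFFER, GL_COLOR_ATTACHMENT0,
                               GL_TEXTURE_2D, tex[i], 0);
    }
    glBindFramebuffer(GL_FRAMEBUFFER, 0);
    // Each frame: sample tex[src] in the fragment shader, draw a
    // full-screen quad into fbo[dst], then swap src and dst.
}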
 
I'm not even a beginner in OpenGL, but I want to be. Once I'm done with my Android project, I want to start on my next project, which is a game I originally started coding years back on my Amiga 4000. This summer I did some preliminary goofing around with the Box2D physics engine, and it seems very slick and straightforward (what took me months, if not years, to model manually in C on the Amiga took me about a day with Box2D). I also took a look at the Unity3D engine, but not thoroughly enough to determine whether it's ideal for what I need (it might be overkill for me). I also played with SDL, but decided it kinda sucks. I may still end up using SDL for loading bitmaps, but I'll probably use something low-level like OpenGL for my rendering with a Box2D backend. Still, learning OpenGL seems a lot more complicated than a simple 2D physics engine. The last time I did graphics programming was with my good old friend Agnus. I miss her so, even if she was kinda fat...
 