How to call CUDA Programs from a C/C++ Application?

Calling a CUDA function from your C/C++ file is very simple. It’s pretty straight forward as you call an extern function in C/C++. To start with, I believe you’ve added the CUDA program in your work space (or copy the program provided below and save as .cu file) and you could compile the file using CUDA compiler and finally the object files has been generated. Please check my previous post to know more about how to compile CUDA source in Visual Studio. CUDA follows C language constructs and rationales. The CUDA compiler will generate the object files which contains the functions and definitions of your CUDA program.

Let’s take a sample presented in a DDJ article CUDA, Supercomputing for the Masses: Part 2. The program increments the content of the array by one. Robb Farber has put an excellent effort to present CUDA in a simple manner in his high performance computing series on CUDA (Check DDJ). Please click on the link to know more about the program presented here.

[sourcecode language='cpp']
// incrementArray.cu
#include
#include
#include

void incrementArrayOnHost(float *a, int N)
{
int i;
for (i=0; i < N; i++) a[i] = a[i]+1.f;
}
__global__ void incrementArrayOnDevice(float *a, int N)
{
int idx = blockIdx.x*blockDim.x + threadIdx.x;
if (idx }

extern "C" void IncrementArray(void)
{
float *a_h, *b_h; // pointers to host memory
float *a_d; // pointer to device memory
int i, N = 10;
size_t size = N*sizeof(float);
// allocate arrays on host
a_h = (float *)malloc(size);
b_h = (float *)malloc(size);
// allocate array on device
cudaMalloc((void **) &a_d, size);
// initialization of host data
for (i=0; i // copy data from host to device
cudaMemcpy(a_d, a_h, sizeof(float)*N, cudaMemcpyHostToDevice);
// do calculation on host
incrementArrayOnHost(a_h, N);
// do calculation on device:
// Part 1 of 2. Compute execution configuration
int blockSize = 4;
int nBlocks = N/blockSize + (N%blockSize == 0?0:1);
// Part 2 of 2. Call incrementArrayOnDevice kernel
incrementArrayOnDevice <<< nBlocks, blockSize >>> (a_d, N);
// Retrieve result from device and store in b_h
cudaMemcpy(b_h, a_d, sizeof(float)*N, cudaMemcpyDeviceToHost);
// check results
for (i=0; i // cleanup
free(a_h); free(b_h); cudaFree(a_d);
}
[/sourcecode]

So the above program creates a simple CUDA program which increments the contents of each element in the array by one. What I changed from the original source code is, I made the main function as a new function which follows the “C” language rules.[that’s why I put extern “C” in front of the code. Even if you don’t add extern “C” it will work fine with your C++ compiler.
Instead of including the CUDA file by #include preprocessor, it’s better to define it as an external function by extern keyword.So your main program may appear as follows.

[sourcecode language='cpp']
// IntegrationWithCPP.cpp : Defines the entry point for the console application.
//

#include “stdafx.h”

// Forward declare the function
extern “C” void IncrementArray();

int _tmain(int argc, _TCHAR* argv[])
{
IncrementArray();
return 0;
}
[/sourcecode]

Finally link the program with cudart.lib cudartd.lib(debug) or cudart.lib(release) and enjoy your program!!!

That’s it. One thing I noticed is that, even if I define the main functions in CUDA file and my C++ file, I’m not getting any error from the linker. The linker gives error only if the main function in CUDA and C++ file having same prototype. Otherwise the version in the CUDA file will be called (from my experience so far).

Sharing my thoughts...

How to Integrate CUDA with Visual C++

If you want to program CUDA, first you will have to install latest nVidia CUDA Driver for your Graphics Hardware which supports desired CUDA version. Then you will have to install the CUDA Toolkit which includes the CUDA Compiler, Include file, lib file and binary files to develop your CUDA application.

The bin folder under your CUDA installation location (most probably C:\CUDA), you can see nvcc.exe which helps you to compile the CUDA program. If you give, nvcc my_cuda_filename.cu, the compiler will compile the source file and creates the executables (a.exe).

But in Windows world, most of the developers are much satisfied with the IDE Visual Studio. So may have to leave the world of command line compilation and source editing in favor of improving our productivity. If we can integrate the CUDA development to Visual Studio IDE, that’s pretty nice no? In one of my previous post, I said about enabling syntax highlighting for CUDA files under Visual Studio.

Now let’s check how we can support CUDA compilation under Visual Studio. CUDA compiler has a dependency with C++ compiler. It supports either Visual C++ 7.1 or 8.0. Currently it doesn’t support Visual C++ 9 (VS 2008).

Method 1 – Install CUDA Build Rule for Visual Studio 2005

There’s a painless method by installing a custom build rule for Visual Studio 2005 developed by JaredHoberock. A cool installer is available to do the necessary settings. When you add the “.cu” files to your application, it shows a dialog box to select the rule you installed. Press “OK” button.

Select CUDA Build Rule

Select CUDA Build Rule

You can control the per-file (.cu file) configuration for your application by selecting the properties of the file from your solution explorer. See the figure below.

Detailed Build Configuration under Wizard

Detailed Build Configuration under Wizard

The installer also adds new Project Type to create your new CUDA project which does the necessary settings for your application.

Wizard to create new CUDA Project

Wizard to create new CUDA Project

Finally you can put the required CUDA libraries (e.g CUDA Runtime library – cudart.lib) in Project->Properties->Linker->Input->Additional Dependency.

I think the installer won’t work for Visual Studio 2003. I think you can manually copy the rule file from $:\Program Files\Microsoft Visual Studio 8\VC\VCProjectDefaults\cuda.rules and specify this as the rule file for compilation. Anyway try it yourself.

Method 2 – Manually Configure by Custom Build Event

Without Visual Studio, we can use nvcc compiler for compiling the CUDA files with various command line parameters. So the same setup we can use with Visual Studio as well. One of the advantages is that, we can use various Visual Studio Specific variables to identify various file locations.

 

Custom Build

Custom Build

 

1. Select the CUDA source file from the solution explorer and take properties.

2.  Select “Custom Build Setup” from the tree

3.  You can select the active configuration from “Configuration” combo box.

4. Specify the following options under “Command Line”

“$(CUDA_BIN_PATH)\nvcc.exe” -ccbin “$(VCInstallDir)bin” -c -D_DEBUG -DWIN32 -D_CONSOLE -D_MBCS -Xcompiler /EHsc,/W3,/nologo,/Wp64,/Od,/Zi,/RTC1,/MTd -I”$(CUDA_INC_PATH)” -I./ -o $(ConfigurationName)\<your_cu_filename>.obj <your_cu_filename>.cu

The above configuration is to compile the source under debug configuration. For a release configuration, you will have to remove the symbol definition, _DEBUG (-D switch is to define symbols). Also the above command line ask to link with the static-debug version of C-Runtime library by specifying /MTd option. In the case of release build you will have to modify it to /MD or /MT. Please check the various C-Runtime versions from MSDN and apply accordingly. Thus a sample command line for release version may look as follows. 

“$(CUDA_BIN_PATH)\nvcc.exe” -ccbin “$(VCInstallDir)bin” -c -DWIN32 -D_CONSOLE -D_MBCS -Xcompiler /EHsc,/W3,/nologo,/Wp64,/Od,/Zi,/RTC1,/MD -I”$(CUDA_INC_PATH)” -I./ -o $(ConfigurationName)\<your_cu_filename>.obj <your_cu_filename>.cu

If you need to add Device Emulation option, you add -deviceemu switch in the command line. Thus you the compiler will generate code for GPGPU Emulation library. You can understand more about each switches from MSDN itself. The information about the other variables used has been specified at the end of this post.

5. You can specify your own description for the custom build event under “Description” text box. (See the figure)

6. Finally you have to specify the output file name under “Outputs”

$(ConfigurationName)\<your_cu_filename>.obj

7. If you  have included any CUDA files (probably will be having extension .cu or .cuh), you can specify it in the “Additional Dependencies option

In the command line string, we’ve used some visual studio specific build macros. You can see the entire list of Visual Studio Build Macros in MSDN.

8. Finally build your project and now see the output of CUDA compilation in your output Window.

Adding a new build configuration

Sometimes you have to build your file with CUDA emulation for many debugging purpose.  Please check MSDN to know more about adding build configuration.

I’ll explain how to call a CUDA Program from a C++ file in the next post. Thanks for your patience. I’m a newbie in CUDA. If you’ve better ideas and suggestions, please share with me through your comments.

 

Sharing my thoughts...

Describing Windows XP and Vista Shutdown in Application context

In this post, I’m talking about some internals of Windows Shutdown in the context of applications.

When the user or an application initiates shutdown request through ExitWindowsEx function, the Csrss.exe will be initiating the shutdown operation by sending a window message to the hidden window owned by Winlogon process. The subsystem process Csrss.exe contains handles, Console (text) windows, creation and deletion of processes and threads and other various supports like ExitWindowEx etc…

The Winlogon process calls ExitWindowsEx function with some internal flags to continue the shutdown based on the security access rights of the currently-logged in user. Since the ExitWindowEx call has been made again, it falls back in the hand of Csrss.exe.

Csrss sends the WM_QUERYENDSESSION message to each thread in the process that has a Windows message loop. This message is like warning from the system that, the application has to respond for the shutdown operation. The shutdown can be preceded when the thread returns TRUE. If the thread returns FALSE, in most cases, the shutdown operation will be cancelled. The shutdown should be cancelled only for the valid reasons like CD burning, live recording etc…

clip_image002Windows Vista supports more interactive shutdown(When the screen like left appears and I wondered when try to shutdown while the office applications are running. Now got the nuts and bolts of it :D ). One or many applications can cancel the shutdown operation under windows XP. In windows vista, if either of the application cancels shutdown operation, windows shows the list of currently running application and also the blocking application on the top of the window. If the user wants to shut his computer down, he can forcefully do that by clicking on the “Shut down now” button(red button in the figure). In this case windows sends WM_QUERYENDSESSION again with ENDSESSION_FORCEFULSHUTDOWN flag. If an application responds FALSE, Windows will continue shutdown instead of canceling it.

Windows Vista provides a facility to pro-actively describe about the reason for cancelling the shutdown. There are three new APIs at our disposal to manage shutdown cancellation string.

[sourcecode language='cpp']
BOOL ShutdownBlockReasonCreate(HWND hWnd, LPCWSTR pwszReason);
BOOL ShutdownBlockReasonDestroy(HWND hWnd);
BOOL ShutdownBlockReasonQuery(HWND hWnd, LPWSTR pwszBuff, DWORD *pcchBuff);
[/sourcecode]


The above functions are used to create, destroy and query the cancellation string for the window. The best time to create the string is at WM_QUERYENDSESSION function and destroys it once the application finished using it. (See the sample below for the usage)


Once the Query end session function returns true or managed to get out from the above described scenarios, Csrss then sends the WM_ENDSESSION Windows message to the thread to request it to exit. The application will get 5 second as time out as default (it’s defined in HKCU\Control Panel\Desktop\HungAppTimeout). If an application times out responding to WM_QUERYENDSESSION or WM_ENDSESSION, Windows will terminate it.

The behavior of the application time out is slightly different in Windows Vista and XP as we see in the picture below. Please check MSDN (I don’t want to copy paste it here) to get the detailed table.

If the thread doesn’t exit before the timeout, Csrss displays the hung-program dialog box (End Now). You can disable this dialog box by changing the registry value HKCU\Control Panel\Desktop\AutoEndTasks to 1. This dialog box indicates that a program isn’t shutting down in a timely manner and gives the user a choice of either killing the process or aborting the shutdown. There is no timeout on this dialog box, which means that a shutdown request could wait forever at this point.

Once all the threads that own windows in the process have exited, Csrss terminates the process and goes on to the next process in the interactive session.

OK Now you’re capable of handling the shutdown message for your application. Here’s a sample snippet for doing the same (MFC)

[sourcecode language='cpp']
/* Map the following functions manually in your application (Win32 style)
*/

BEGIN_MESSAGE_MAP(CShutDownTestDlg, CDialog)
ON_MESSAGE( WM_QUERYENDSESSION, QueryEndSession )
ON_MESSAGE( WM_ENDSESSION, EndSession )
//}}AFX_MSG_MAP
END_MESSAGE_MAP()

LRESULT CShutDownTestDlg::QueryEndSession(WPARAM,LPARAM)
{
// Block shut down
ShutdownBlockReasonCreate(m_hWnd, L”TV Show being recorded” );
return FALSE;
// return TRUE; // continue shutdown
}

LRESULT CShutDownTestDlg::EndSession(WPARAM,LPARAM)
{
// Destroy the shutdown string used earlier
ShutdownBlockReasonDestroy(m_hWnd);
return TRUE;
}
[/sourcecode]

Here’s the pure MFC way to do the same

[sourcecode language='cpp']
/* Add the function using class wizard or through property window (VS2003 or above)
*/

BEGIN_MESSAGE_MAP(CShutDownTestDlg, CDialog)
ON_WM_QUERYENDSESSION()
ON_WM_ENDSESSION()
//}}AFX_MSG_MAP
END_MESSAGE_MAP()

BOOL CShutDownTestDlg::OnQueryEndSession()
{
// Block shut down
ShutdownBlockReasonCreate(m_hWnd, L”TV Show being recorded” );
return FALSE;

}

void CShutDownTestDlg::OnEndSession(BOOL bEnding)
{
// Destroy the shutdown string used earlier
ShutdownBlockReasonDestroy(m_hWnd);
CDialog::OnEndSession(bEnding);
}
[/sourcecode]

Hope you enjoyed this lengthy post. Thanks for MSDN articleand Mark Russinovich for his book Windows Internals. Those were the light for my path.

Sharing my thoughts...