annotate src/gpu/ptx/vm/ptxKernelArguments.cpp @ 11842:8d8f63069f58
PTX warp limiter to available GPU processors
author | Morris Meyer <morris.meyer@oracle.com> |
---|---|
date | Mon, 30 Sep 2013 13:03:47 -0400 |
parents | d8659ad83fcc |
children | c7abc8411011 |
rev | line source |
---|---|
11485
49bb1bc983c6
Implement several missing PTX codegen features; return value capture and method args passing of java method executed on GPU.
bharadwaj
parents:
diff
changeset
|
1 /* |
2  * Copyright (c) 2013, Oracle and/or its affiliates. All rights reserved. |
3  * DO NOT ALTER OR REMOVE COPYRIGHT NOTICES OR THIS FILE HEADER. |
4  * |
5  * This code is free software; you can redistribute it and/or modify it |
6  * under the terms of the GNU General Public License version 2 only, as |
7  * published by the Free Software Foundation. |
8  * |
9  * This code is distributed in the hope that it will be useful, but WITHOUT |
10  * ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or |
11  * FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License |
12  * version 2 for more details (a copy is included in the LICENSE file that |
13  * accompanied this code). |
14  * |
15  * You should have received a copy of the GNU General Public License version |
16  * 2 along with this work; if not, write to the Free Software Foundation, |
17  * Inc., 51 Franklin St, Fifth Floor, Boston, MA 02110-1301 USA. |
18  * |
19  * Please contact Oracle, 500 Oracle Parkway, Redwood Shores, CA 94065 USA |
20  * or visit www.oracle.com if you need additional information or have any |
21  * questions. |
22  * |
23  */ |
24 |
25 #include "precompiled.hpp" |
11596
91e5f927af63
Initial implementation of PTXRuntime (RegisterConfig, PTX description etc); guarded with new flag UseGPU. Specify -XX:+UseGPU to exercise this new implementation.
bharadwaj
parents:
11485
diff
changeset
|
26 #include "ptxKernelArguments.hpp" |
11485
27 #include "runtime/javaCalls.hpp" |
28 |
29 gpu::Ptx::cuda_cu_memalloc_func_t gpu::Ptx::_cuda_cu_memalloc; |
30 gpu::Ptx::cuda_cu_memcpy_htod_func_t gpu::Ptx::_cuda_cu_memcpy_htod; |
31 |
32 // Get next java argument |
33 oop PTXKernelArguments::next_arg(BasicType expectedType) { |
34   assert(_index < _args->length(), "out of bounds"); |
11821
d8659ad83fcc
PTX single-threaded array store, Warp annotation
Morris Meyer <morris.meyer@oracle.com>
parents:
11596
diff
changeset
|
35 |
36   oop arg = ((objArrayOop) (_args))->obj_at(_index++); |
37   assert(expectedType == T_OBJECT || |
38          java_lang_boxing_object::is_instance(arg, expectedType), "arg type mismatch"); |
39 |
11485
40   return arg; |
41 } |
42 |
11821
43 void PTXKernelArguments::do_int() { |
44   if (is_after_invocation()) { |
45     return; |
46   } |
11596
47   // If the parameter is a return value, |
11485
48   if (is_return_type()) { |
49     // Allocate device memory for T_INT return value pointer on device. Size in bytes |
50     int status = gpu::Ptx::_cuda_cu_memalloc(&_return_value_ptr, T_INT_BYTE_SIZE); |
51     if (status != GRAAL_CUDA_SUCCESS) { |
52       tty->print_cr("[CUDA] *** Error (%d) Failed to allocate memory for return value pointer on device", status); |
53       _success = false; |
54       return; |
55     } |
56     // Push _return_value_ptr to _kernelBuffer |
57     *((gpu::Ptx::CUdeviceptr*) &_kernelArgBuffer[_bufferOffset]) = _return_value_ptr; |
58     _bufferOffset += sizeof(_return_value_ptr); |
11821
59   } else { |
11485
60     // Get the next java argument and its value which should be a T_INT |
61     oop arg = next_arg(T_INT); |
62     // Copy the java argument value to kernelArgBuffer |
63     jvalue intval; |
64     if (java_lang_boxing_object::get_value(arg, &intval) != T_INT) { |
65       tty->print_cr("[CUDA] *** Error: Unexpected argument type; expecting T_INT"); |
66       _success = false; |
67       return; |
68     } |
69     *((gpu::Ptx::CUdeviceptr*) &_kernelArgBuffer[_bufferOffset]) = intval.i; |
70     _bufferOffset += sizeof(intval.i); |
71   } |
72   return; |
73 } |
74 |
11821
75 void PTXKernelArguments::do_long() { |
76   if (is_after_invocation()) { |
77     return; |
78   } |
11596
79   // If the parameter is a return value, |
11485
80   if (is_return_type()) { |
81     // Allocate device memory for T_LONG return value pointer on device. Size in bytes |
82     int status = gpu::Ptx::_cuda_cu_memalloc(&_return_value_ptr, T_LONG_BYTE_SIZE); |
83     if (status != GRAAL_CUDA_SUCCESS) { |
84       tty->print_cr("[CUDA] *** Error (%d) Failed to allocate memory for return value pointer on device", status); |
85       _success = false; |
86       return; |
87     } |
88     // Push _return_value_ptr to _kernelBuffer |
89     *((gpu::Ptx::CUdeviceptr*) &_kernelArgBuffer[_bufferOffset]) = _return_value_ptr; |
90     _bufferOffset += sizeof(_return_value_ptr); |
11821
91   } else { |
11485
92     // Get the next java argument and its value which should be a T_LONG |
93     oop arg = next_arg(T_LONG); |
94     // Copy the java argument value to kernelArgBuffer |
95     jvalue val; |
96     if (java_lang_boxing_object::get_value(arg, &val) != T_LONG) { |
97       tty->print_cr("[CUDA] *** Error: Unexpected argument type; expecting T_LONG"); |
98       _success = false; |
99       return; |
100     } |
101     *((gpu::Ptx::CUdeviceptr*) &_kernelArgBuffer[_bufferOffset]) = val.j; |
102     _bufferOffset += sizeof(val.j); |
103   } |
104   return; |
105 } |
106 |
11821
107 void PTXKernelArguments::do_byte() { |
108   if (is_after_invocation()) { |
109     return; |
110   } |
111   // If the parameter is a return value, |
112   if (is_return_type()) { |
113     // Allocate device memory for T_BYTE return value pointer on device. Size in bytes |
114     int status = gpu::Ptx::_cuda_cu_memalloc(&_return_value_ptr, T_BYTE_SIZE); |
115     if (status != GRAAL_CUDA_SUCCESS) { |
116       tty->print_cr("[CUDA] *** Error (%d) Failed to allocate memory for return value pointer on device", status); |
117       _success = false; |
118       return; |
119     } |
120     // Push _return_value_ptr to _kernelBuffer |
121     *((gpu::Ptx::CUdeviceptr*) &_kernelArgBuffer[_bufferOffset]) = _return_value_ptr; |
122     _bufferOffset += sizeof(_return_value_ptr); |
123   } else { |
124     // Get the next java argument and its value which should be a T_BYTE |
125     oop arg = next_arg(T_BYTE); |
126     // Copy the java argument value to kernelArgBuffer |
127     jvalue val; |
128     if (java_lang_boxing_object::get_value(arg, &val) != T_BYTE) { |
129       tty->print_cr("[CUDA] *** Error: Unexpected argument type; expecting T_BYTE"); |
130       _success = false; |
131       return; |
132     } |
133     *((gpu::Ptx::CUdeviceptr*) &_kernelArgBuffer[_bufferOffset]) = val.b; |
134     _bufferOffset += sizeof(val.b); |
135   } |
136   return; |
137 } |
138 |
139 void PTXKernelArguments::do_array(int begin, int end) { |
d8659ad83fcc
PTX single-threaded array store, Warp annotation
Morris Meyer <morris.meyer@oracle.com>
parents:
11596
diff
changeset
|
140 gpu::Ptx::CUdeviceptr _array_ptr; |
d8659ad83fcc
PTX single-threaded array store, Warp annotation
Morris Meyer <morris.meyer@oracle.com>
parents:
11596
diff
changeset
|
141 int status; |
d8659ad83fcc
PTX single-threaded array store, Warp annotation
Morris Meyer <morris.meyer@oracle.com>
parents:
11596
diff
changeset
|
142 |
d8659ad83fcc
PTX single-threaded array store, Warp annotation
Morris Meyer <morris.meyer@oracle.com>
parents:
11596
diff
changeset
|
143 // Get the next java argument and its value which should be a T_ARRAY |
d8659ad83fcc
PTX single-threaded array store, Warp annotation
Morris Meyer <morris.meyer@oracle.com>
parents:
11596
diff
changeset
|
144 oop arg = next_arg(T_OBJECT); |
d8659ad83fcc
PTX single-threaded array store, Warp annotation
Morris Meyer <morris.meyer@oracle.com>
parents:
11596
diff
changeset
|
145 int array_size = arg->size() * HeapWordSize; |
d8659ad83fcc
PTX single-threaded array store, Warp annotation
Morris Meyer <morris.meyer@oracle.com>
parents:
11596
diff
changeset
|
146 |
d8659ad83fcc
PTX single-threaded array store, Warp annotation
Morris Meyer <morris.meyer@oracle.com>
parents:
11596
diff
changeset
|
147 if (is_after_invocation()) { |
d8659ad83fcc
PTX single-threaded array store, Warp annotation
Morris Meyer <morris.meyer@oracle.com>
parents:
11596
diff
changeset
|
148 _array_ptr = *((gpu::Ptx::CUdeviceptr*) &_kernelArgBuffer[_bufferOffset]); |
d8659ad83fcc
PTX single-threaded array store, Warp annotation
Morris Meyer <morris.meyer@oracle.com>
parents:
11596
diff
changeset
|
149 status = gpu::Ptx::_cuda_cu_memcpy_dtoh(arg, _array_ptr, array_size); |
d8659ad83fcc
PTX single-threaded array store, Warp annotation
Morris Meyer <morris.meyer@oracle.com>
parents:
11596
diff
changeset
|
150 if (status != GRAAL_CUDA_SUCCESS) { |
d8659ad83fcc
PTX single-threaded array store, Warp annotation
Morris Meyer <morris.meyer@oracle.com>
parents:
11596
diff
changeset
|
151 tty->print_cr("[CUDA] *** Error (%d) Failed to copy array argument to host", status); |
d8659ad83fcc
PTX single-threaded array store, Warp annotation
Morris Meyer <morris.meyer@oracle.com>
parents:
11596
diff
changeset
|
152 _success = false; |
d8659ad83fcc
PTX single-threaded array store, Warp annotation
Morris Meyer <morris.meyer@oracle.com>
parents:
11596
diff
changeset
|
153 return; |
d8659ad83fcc
PTX single-threaded array store, Warp annotation
Morris Meyer <morris.meyer@oracle.com>
parents:
11596
diff
changeset
|
154 } else { |
d8659ad83fcc
PTX single-threaded array store, Warp annotation
Morris Meyer <morris.meyer@oracle.com>
parents:
11596
diff
changeset
|
155 // tty->print_cr("device: %x host: %x size: %d", _array_ptr, arg, array_size); |
d8659ad83fcc
PTX single-threaded array store, Warp annotation
Morris Meyer <morris.meyer@oracle.com>
parents:
11596
diff
changeset
|
156 } |
d8659ad83fcc
PTX single-threaded array store, Warp annotation
Morris Meyer <morris.meyer@oracle.com>
parents:
11596
diff
changeset
|
157 return; |
d8659ad83fcc
PTX single-threaded array store, Warp annotation
Morris Meyer <morris.meyer@oracle.com>
parents:
11596
diff
changeset
|
158 } |
d8659ad83fcc
PTX single-threaded array store, Warp annotation
Morris Meyer <morris.meyer@oracle.com>
parents:
11596
diff
changeset
|
159 // Allocate device memory for T_ARRAY return value pointer on device. Size in bytes |
160 status = gpu::Ptx::_cuda_cu_memalloc(&_return_value_ptr, array_size); |
11485
49bb1bc983c6
Implement several missing PTX codegen features; return value capture and method args passing of java method executed on GPU.
bharadwaj
parents:
diff
changeset
|
161 if (status != GRAAL_CUDA_SUCCESS) { |
11821
d8659ad83fcc
PTX single-threaded array store, Warp annotation
Morris Meyer <morris.meyer@oracle.com>
parents:
11596
diff
changeset
|
162 tty->print_cr("[CUDA] *** Error (%d) Failed to allocate memory for return value pointer on device", status); |
163 _success = false; |
164 return; |
165 } |
166 status = gpu::Ptx::_cuda_cu_memcpy_htod(_return_value_ptr, arg, array_size); |
167 if (status != GRAAL_CUDA_SUCCESS) { |
168 tty->print_cr("[CUDA] *** Error (%d) Failed to copy array to device argument", status); |
169 _success = false; |
170 return; |
171 } else { |
172 // tty->print_cr("host: %x device: %x size: %d", arg, _return_value_ptr, array_size); |
11485
49bb1bc983c6
Implement several missing PTX codegen features; return value capture and method args passing of java method executed on GPU.
bharadwaj
parents:
diff
changeset
|
173 } |
174 // Push _return_value_ptr to _kernelArgBuffer |
175 *((gpu::Ptx::CUdeviceptr*) &_kernelArgBuffer[_bufferOffset]) = _return_value_ptr; |
176 _bufferOffset += sizeof(_return_value_ptr); |
11821
d8659ad83fcc
PTX single-threaded array store, Warp annotation
Morris Meyer <morris.meyer@oracle.com>
parents:
11596
diff
changeset
|
177 return; |
178 } |
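Lines 175 and 176 above store the device pointer in the kernel argument buffer through a pointer cast and then advance the write offset. The following is a minimal sketch of that one step in isolation, not the code from this file. `DevicePtr`, `ArgBuffer`, `kArgBufferCapacity`, `push` and `read` are hypothetical names, and the 64-byte capacity is an assumption. The sketch uses `memcpy` where the source uses a cast, and it adds a bounds check that the source does not perform.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>

// Hypothetical stand-in for gpu::Ptx::CUdeviceptr; the real typedef is not shown here.
typedef std::uint64_t DevicePtr;

// Assumed capacity for illustration only; the real buffer size is not shown here.
const std::size_t kArgBufferCapacity = 64;

// Hypothetical model of the _kernelArgBuffer / _bufferOffset pair.
struct ArgBuffer {
  unsigned char bytes[kArgBufferCapacity];
  std::size_t offset;

  ArgBuffer() : offset(0) { std::memset(bytes, 0, sizeof(bytes)); }

  // Copies the pointer value into the buffer at the current offset and
  // advances the offset by the size of the value.
  // Returns false, leaving the buffer unchanged, when there is no room left.
  bool push(DevicePtr value) {
    if (offset + sizeof(value) > kArgBufferCapacity) {
      return false;
    }
    std::memcpy(&bytes[offset], &value, sizeof(value));
    offset += sizeof(value);
    return true;
  }

  // Reads back the pointer value stored at byte position `at`.
  DevicePtr read(std::size_t at) const {
    DevicePtr value;
    std::memcpy(&value, &bytes[at], sizeof(value));
    return value;
  }
};
```

Copying with `memcpy` keeps the store well defined even when the offset is not a multiple of the pointer size, which a direct cast does not guarantee. Whether the real buffer needs that protection depends on how earlier arguments are packed, which this excerpt does not show.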
179 |
180 void PTXKernelArguments::do_void() { |
181 return; |
11485
49bb1bc983c6
Implement several missing PTX codegen features; return value capture and method args passing of java method executed on GPU.
bharadwaj
parents:
diff
changeset
|
182 } |
183 |
184 // TODO implement other do_* |