Research Papers: Techniques and Procedures

2012 Freeman Scholar Lecture: Computational Fluid Dynamics on Graphics Processing Units

Fellow ASME
Department of Mechanical Science and Engineering,
University of Illinois at Urbana-Champaign,
1206 W. Green Street,
Urbana, IL 61801
e-mail: spvanka@illinois.edu

Contributed by the Fluids Engineering Division of ASME for publication in the JOURNAL OF FLUIDS ENGINEERING. Manuscript received September 6, 2012; final manuscript received February 21, 2013; published online April 23, 2013. Assoc. Editor: David E. Stock.

J. Fluids Eng. 135(6), 061401 (Apr 23, 2013) (23 pages) Paper No: FE-12-1431; doi: 10.1115/1.4023858

This paper discusses the issues involved in using graphics processing units (GPUs) to compute fluid flows. GPUs, used primarily for processing graphics functions in a computer, are massively parallel multicore processors that can also perform scientific computations in a data-parallel mode. In the past ten years, GPUs have become quite powerful and now challenge central processing units (CPUs) in price and performance. However, to benefit fully from a GPU's performance, the numerical algorithms must be data parallel and must converge rapidly. In addition, the hardware features of GPUs require that memory access be managed carefully to avoid the penalty of high memory latency. Fully explicit algorithms for the Euler and Navier–Stokes equations and the lattice Boltzmann method for mesoscopic flows have been widely implemented on GPUs, with significant speed-up over a scalar algorithm. However, more complex algorithms with implicit formulations and unstructured grids require innovative thinking in data access and management. This article reviews the literature on linear solvers and computational fluid dynamics (CFD) algorithms on GPUs, including the author's own research on simulations of fluid flows using GPUs.
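
As an illustration of what "data parallel" means for a flow solver, the following minimal sketch performs Jacobi sweeps for the two-dimensional Laplace equation on a uniform grid. It is not taken from the paper: the function names `jacobi_sweep` and `solve_laplace`, the choice of the Laplace equation, and the nested-list storage are assumptions made only for this example. The point it shows is that each interior value is computed solely from the previous iterate, so every grid point could in principle be assigned to its own GPU thread.

```python
def jacobi_sweep(u):
    """Return one Jacobi sweep for the 2D Laplace equation (hypothetical helper).

    Boundary values are held fixed. Each interior value depends only on the
    previous iterate `u`, so the updates are independent of one another.
    """
    ny, nx = len(u), len(u[0])
    new = [row[:] for row in u]  # copy so boundary values carry over unchanged
    for j in range(1, ny - 1):      # on a GPU, these two loops would be
        for i in range(1, nx - 1):  # replaced by one thread per (i, j) point
            new[j][i] = 0.25 * (u[j][i - 1] + u[j][i + 1]
                                + u[j - 1][i] + u[j + 1][i])
    return new


def solve_laplace(u, sweeps):
    """Apply a fixed number of Jacobi sweeps and return the final grid."""
    for _ in range(sweeps):
        u = jacobi_sweep(u)
    return u


if __name__ == "__main__":
    n = 5
    grid = [[0.0] * n for _ in range(n)]
    grid[0] = [1.0] * n  # one wall held at 1, the other walls at 0
    result = solve_laplace(grid, 200)
    print(result[n // 2][n // 2])
```

Because the sweep reads only from the old array and writes only to the new one, the order in which points are visited does not matter. A Gauss–Seidel sweep, by contrast, would reuse freshly updated neighbors and introduce a dependency between points, which is one reason implicit and sequential-update schemes need more care on GPUs.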

Copyright © 2013 by ASME


Nickolls, J., and Dally, W. J., 2010, “The GPU Computing Era,” IEEE MICRO, 30(2), pp. 56–69. [CrossRef]
Fatahalian, K., and Houston, M., 2008, “A Closer Look at GPUs,” Commun. ACM, 51(10), pp. 50–57. [CrossRef]
Boyd, C., 2008, “Data Parallel Computing,” ACM Queue, 6(2), pp. 31–39. [CrossRef]
Lindholm, E., Nickolls, J., Oberman, S., and Montrym, J., 2008, “NVIDIA Tesla: A Unified Graphics and Computing Architecture,” IEEE MICRO, 28(2), pp. 39–55. [CrossRef]
“Products & Technologies,” AMD, http://www.amd.com/us/products
Patankar, S. V., 1980, Numerical Heat Transfer and Fluid Flow, McGraw Hill, New York.
Fletcher, C. A. J., 1991, Computational Techniques for Fluid Dynamics, Springer, Berlin.
Anderson, D. A., Tannehill, J. C., and Pletcher, R. H., 1984, Computational Fluid Mechanics and Heat Transfer, Hemisphere, New York.
Ferziger, J. H., and Peric, M., 2002, Computational Methods for Fluid Dynamics, 3rd ed., Springer Verlag, Berlin.
“CFD and CAE Products – CD-adapco,” CD-adapco, http://www.cd-adapco.com/products/
“COMSOL Multiphysics Engineering Simulation Software,” COMSOL, http://www.comsol.com/products/multiphysics/
“ESI Group – Fluid Dynamics,” ESI, http://www.esi-group.com/products/Fluid-Dynamics
Metacomp Technologies, http://www.metacomptech.com/
Pope, S. B., 2000, Turbulent Flows, Cambridge University, Cambridge, England.
Gorder, P. F., 2007, “Multicore Processors for Science and Engineering,” Comput. Sci. Eng., 9(2), pp. 3–7. [CrossRef]
Geer, D., 2005, “Chip Makers Turn to Multicore Processors,” Computer, 38(5), pp. 11–13. [CrossRef]
Owens, J. D., Houston, M., Luebke, D., Green, S., Stone, J. E., and Phillips, J. C., 2008, “GPU Computing,” Proc. IEEE, 96(5), pp. 879–899. [CrossRef]
Kirk, D. B., and Hwu, W. W., 2010, Programming Massively Parallel Processors: A Hands-On Approach (Applications of GPU Computing Series), Morgan Kaufmann, Burlington, MA.
Liu, G. R., and Liu, M. B., 2003, Smoothed Particle Hydrodynamics: A Meshfree Particle Method, World Scientific, Singapore.
Succi, S., 2001, The Lattice Boltzmann Equation for Fluid Dynamics and Beyond, Oxford University, New York.
Bird, G. A., 1994, Molecular Gas Dynamics and the Direct Simulation of Gas Flows, Oxford University, New York.
“Parallel Programming and Computing Platform: CUDA,” NVIDIA, http://www.nvidia.com/object/cuda_home_new.html
Nickolls, J., Buck, I., Garland, M., and Skadron, K., 2008, “Scalable Parallel Programming With CUDA,” ACM Queue, 6(2), pp. 41–53. [CrossRef]
Halfhill, T. R., 2008, “Parallel Processing With CUDA,” Microprocessor Rep., Jan. 28, 2008.
Sanders, J., and Kandrot, E., 2011, CUDA by Example: An Introduction to General-Purpose GPU Programming, Addison-Wesley, New Jersey.
Cook, S., 2011, CUDA Programming: A Developer's Guide to Parallel Computing With GPUs, Morgan Kaufmann, Burlington, MA.
Farber, R., 2011, CUDA Application Design and Development, Elsevier, New York.
Tsuchiyama, R., Nakamura, T., Iizuka, T., Asahara, A., Son, J., and Miki, S., 2012, The OpenCL Programming Book, Fixstars Corporation, Japan.
PGI CUDA FORTRAN Compiler, The Portland Group, http://www.pgroup.com/resources/accel_files/index.htm
Harlow, F. H., and Welch, J. E., 1965, “Numerical Calculation of Time-Dependent Viscous Incompressible Flow of Fluid With a Free Surface,” Phys. Fluids, 8(12), pp. 2182–2189. [CrossRef]
Hockney, R. W., and Jesshope, C. R., 1981, Parallel Computers, Adam Hilger, Bristol, UK.
Greenbaum, A., 1997, Iterative Methods for Solving Linear Systems, SIAM, Philadelphia.
Saad, Y., 2003, Iterative Methods for Sparse Linear Systems, SIAM, Philadelphia.
Hockney, R. W., 1965, “A Fast Direct Solution of Poisson's Equation Using Fourier's Analysis,” J. ACM, 12(1), pp. 95–113. [CrossRef]
Allmann, S., Rauber, T., and Runger, G., 2001, “Cyclic Reduction on Distributed Shared Memory Machines,” Euromicro Conference on Parallel Distributed and Networked-Based Processing, IEEE Computer Society, pp. 290–297. [CrossRef]
Lambiotte, J. J., and Voigt, R. G., 1975, “The Solution of Tridiagonal Linear Systems on the CDC STAR-100 Computer,” ACM Trans. Math. Softw., 1(4), pp. 308–329. [CrossRef]
Muller, S. M., and Sheerer, D., 1991, “A Method to Parallelize Tridiagonal Solvers,” Parallel Comput., 17, pp. 181–188. [CrossRef]
Stone, H. S., 1973, “An Efficient Parallel Algorithm for the Solution of a Tridiagonal Linear System of Equations,” J. ACM, 20(1), pp. 27–38. [CrossRef]
Ho, C. T., and Johnson, S. L., 1990, “Optimizing Tridiagonal Solvers for Alternating Direction Methods on Boolean Cube Multiprocessors,” SIAM (Soc. Ind. Appl. Math.) J. Sci. Stat. Comput., 11(3), pp. 563–592. [CrossRef]
Egecioglu, O., Koc, C. K., and Laub, A. J., 1989, “A Recursive Doubling Algorithm for Solution of Tridiagonal Systems on Hypercube Multiprocessors,” J. Comput. Appl. Math., 27, pp. 95–108. [CrossRef]
Zhang, Y., Cohen, J., and Owens, J. D., 2010, “Fast Tridiagonal Solvers on the GPU,” Proceedings of the 15th ACM SIGPLAN Symposium on the Principles and Practice of Parallel Programming, pp. 127–136. [CrossRef]
Davidson, A., Zhang, Y., and Owens, J. D., 2011, “An Auto-Tuned Method for Solving Large Tridiagonal Systems on the GPU,” Proceedings of the 2011 IEEE International Parallel & Distributed Processing Symposium, pp. 956–965. [CrossRef]
Egloff, D., 2010, “High Performance Finite Difference PDE Solvers on GPUs,” QuantAlea GmbH Technical Report.
Sakharnykh, N., 2010, “Efficient Tridiagonal Solvers for ADI Methods and Fluid Simulation,” NVIDIA GPU Technology Conference.
Goddeke, D., and Strzodka, R., 2011, “Cyclic Reduction Tridiagonal Solvers on GPUs Applied to Mixed Precision Multigrid,” IEEE Trans. Parallel Distrib. Syst., 22(1), pp. 22–32. [CrossRef]
Bolz, J., Farmer, I., Grinspun, E., and Schroder, P., 2003, “Sparse Matrix Solvers on the GPU: Conjugate Gradients and Multigrid,” ACM Trans. Graphics, 22(3), pp. 917–924. [CrossRef]
Goodnight, N., Woolley, C., Lewin, G., Luebke, D., and Humphreys, G., 2003, “A Multigrid Solver for Boundary Value Problems Using Programmable Graphics Hardware,” SIGGRAPH/EUROGRAPHICS Conference on Graphics Hardware, pp. 1–11.
Bahi, J. M., Couturier, R., and Khodja, L. Z., 2011, “Parallel Sparse Linear Solver GMRES for GPU Clusters With Compression of Exchanged Data,” Lect. Notes Comput. Sci., 7155, pp. 471–480. [CrossRef]
Amador, G., and Gomes, A., 2009, “Linear Solvers for Stable Fluids: GPU vs CPU,” Proceedings of the 17th Encontro Português de Computação Gráfica (EPCG’09), pp. 145–153.
Gaikwad, A., and Toke, I. M., 2010, “Parallel Iterative Linear Solvers on GPU: A Financial Engineering Case,” Proceedings of the 2010 18th Euromicro Conference on Parallel, Distributed and Network-Based Processing (PDP), pp. 607–614. [CrossRef]
Li, R., and Saad, Y., 2013, “GPU-Accelerated Preconditioned Iterative Linear Solvers,” J. Supercomput., 63, pp. 443–466. [CrossRef]
Jost, T., Contassot-Vivier, S., and Vialle, S., 2010, “An Efficient Multi-Algorithms Sparse Linear Solver for GPUs,” Parallel Computing: From Multicores and GPU's to Petascale, Vol. 19, IOS Press, Amsterdam, The Netherlands, pp. 546–553. [CrossRef]
Haase, G., Liebmann, M., Douglas, C. C., and Plank, G., 2010, “A Parallel Algebraic Multigrid Solver on Graphics Processing Units,” Lect. Notes Comput. Sci., 5938, pp. 38–47. [CrossRef]
Wiggers, W. A., Bakker, V., Kokkeler, A. B. J., and Smit, G. J. M., 2007, “Implementing the Conjugate Gradient Algorithm on Multi-Core Systems,” International Symposium on System-on-Chip (ISSOC), Tampere, Finland, Nov. 19–21, pp. 1–4. [CrossRef]
Cevahir, A., Nukada, A., and Matsuoka, S., 2009, “Fast Conjugate Gradients With Multiple GPUs,” International Conference on Computational Sciences (ICCS), Vol. 5544, Springer, New York, pp. 893–903. [CrossRef]
Liu, X., Liu, Z., Tan, S. X.-D., and Gordon, J., 2012, “Full-Chip Thermal Analysis of 3D ICs With Liquid Cooling by GPU-Accelerated GMRES Method,” ISQED (2012), pp. 123–128. [CrossRef]
Heuveline, V., Lukarski, D., and Weiss, J. P., 2012, “Fine-Grained Parallel Preconditioners for Fast GPU-Based Solvers,” NVIDIA GPU Technology Conference, San Jose, CA, May.
Kruger, J., and Westermann, R., 2003, “Linear Algebra Operators for GPU Implementation of Numerical Algorithms,” ACM Trans. Graphics, 22(3), pp. 908–913. [CrossRef]
Williams, S., Vuduc, R., Oliker, L., Shalf, J., Yelick, K., and Demmel, J., 2009, “Optimizing Sparse Matrix-Vector Multiply on Emerging Multicore Platforms,” Parallel Comput., 35(3), pp. 178–194. [CrossRef]
Williams, S., Bell, N., Choi, J., Garland, M., Oliker, L., and Vu, R., 2010, “Sparse Matrix-Vector Multiplication on Multicore and Accelerators,” Scientific Computing With Multicore and Accelerators, CRC Press, Boca Raton, FL. [CrossRef]
Bell, N., and Garland, M., 2008, “Efficient Sparse Matrix-Vector Multiplication on CUDA,” NVIDIA Technical Report No. NVR 2008-004.
Baskaran, M., and Bordawekar, R., 2008, “Optimizing Sparse Matrix-Vector Multiplications on GPUs,” IBM Technical Report No. RC 24704.
Buatois, L., Caumon, G., and Levy, B., 2009, “Concurrent Number Cruncher – GPU Implementation of a General Sparse Linear Solver,” Int. J. Parallel, Emergent, Distrib. Syst., 24(3), pp. 205–223. [CrossRef]
Tomov, S., Nath, R., Ltaief, H., and Dongarra, J., 2010, “Dense Linear Algebra Solvers for Multicore With GPU Accelerators,” IEEE International Symposium on Parallel & Distributed Processing, pp. 1–8. [CrossRef]
Weber, P., Du, R., Luszczek, P., Tomov, S., Peterson, G., and Dongarra, J., 2012, “From CUDA to OpenCL: Towards a Performance-Portable Solution for Multi-Platform GPU Programming,” Parallel Comput., 38(8), pp. 391–407. [CrossRef]
Buttari, A., Langon, J., Kurzak, J., and Dongarra, J., 2009, “A Class of Parallel Tiled Linear Algebra Algorithms for Multicore Architectures,” Parallel Comput., 35(1), pp. 38–53. [CrossRef]
“GPGPU.org: General-Purpose Computation on Graphics Processing Units,” GPGPU, http://www.gpgpu.org
Humphrey, J. R., Price, D. K., Spagnoli, K. E., Paolini, A. L., and Kelmelis, E. J., 2010, “CULA: Hybrid GPU Accelerated Linear Algebra Routines,” Proc. SPIE, 7705, p. 770502. [CrossRef]
Volkov, V., and Demmel, J. W., 2008, “Benchmarking GPUs to Tune Dense Linear Algebra,” Proc. 2008 ACM/IEEE Conference on Supercomputing, pp. 31–41.
Vuduc, R., Chandramowlishwaran, A., Choi, J., Guney, M., and Shringarpure, A., 2010, “On the Limits of GPU Acceleration,” Proc. USENIX Wkshp. Hot Topics in Parallelism (HotPar), Berkeley, CA, June.
Agarwal, R. K., 1989, “Development of a Navier-Stokes Code on a Connection Machine,” Proc. of the 9th AIAA Computational Fluid Dynamics Conference, Buffalo, NY, June, AIAA, Paper No. 89-1938, pp. 103–108. [CrossRef]
Agarwal, R. K., and Lewis, J. C., 1992, “Computational Fluid Dynamics on Parallel Processors,” Comput. Syst. Eng., 3(1–4), pp. 251–259. [CrossRef]
Levit, C., and Jespersen, D., 1988, “Explicit and Implicit Solution of Navier-Stokes Equations on a Massively Parallel Computer,” Comput. Struct., 30(1–2), pp. 385–393. [CrossRef]
Robichaux, J., Tafti, D. K., and Vanka, S. P., 1992, “Large-Eddy Simulations of Turbulence on the CM-2,” Numer. Heat Transfer, Part B, 21(3), pp. 367–388. [CrossRef]
Wang, G., 1996, “Large Eddy Simulations of Bluff-Body Wakes on Parallel Computers,” Ph.D. thesis, University of Illinois at Urbana-Champaign, Champaign, IL.
Kass, M., and Miller, G., 1990, “Rapid, Stable Fluid Dynamics for Computer Graphics,” Computer Graphics (Proc. of SIGGRAPH 90), pp. 49–57. [CrossRef]
Stam, J., 1999, “Stable Fluids,” Proc. 26th Annual Conference on Computer Graphics and Interactive Techniques (SIGGRAPH), pp. 121–128. [CrossRef]
Stam, J., 2001, “A Simple Fluid Solver Based on FFT,” J. Graph Tools, 6(2), pp. 43–52. [CrossRef]
Harris, M., 2004, “Fast Fluid Dynamics Simulation on the GPU,” GPU Gems, Pearson Education, Boston, MA, pp. 637–665.
Amador, G., and Gomes, A., 2010, “CUDA-Based Linear Solvers for Stable Fluids,” International Conference on Information Science and Applications (ICISA), Apr. 21–23. [CrossRef]
Crane, K., Llamas, I., and Tariq, S., 2007, “Real-Time Simulation and Rendering of 3D Fluids,” GPU Gems, Vol. 3, Pearson Education, Boston, MA, pp. 633–675.
Scheidegger, C. E., Comba, J. L. D., and da Cunha, R. D., 2005, “Practical CFD Simulations on Programmable Graphics Hardware Using SMAC,” Comput. Graph. Forum, 24, pp. 715–728. [CrossRef]
Comba, J. L. D., Dietrich, C., Pagot, C., and Scheidegger, C. E., 2003, “Computations on GPUs: From a Programmable Pipeline to an Efficient Stream Processor,” Rev. Inf. Teór. Appl., 10, pp. 41–70.
Goddeke, D., Strzodka, R., and Turek, S., 2007, “Performance and Accuracy of Hardware-Oriented Native Emulated and Mixed-Precision Solvers in FEM Simulations,” Int. J. Parallel Emergent Distrib. Syst., 22, pp. 221–256. [CrossRef]
Goddeke, D., Strzodka, R., Mohd-Yusof, J., McCormick, P., Wobker, H., Becker, C., and Turek, S., 2008, “Using GPUs to Improve Multigrid Solver Performance on a Cluster,” Int. J. CSE, 4(1), pp. 36–55. [CrossRef]
Hagen, T., Lie, K., and Natvig, J., 2006, “Solving the Euler Equations on Graphics Processing Units,” Comput. Sci. (ICCS), 3994, pp. 220–227. [CrossRef]
Hagen, T. R., Hjelmervik, J. M., Lie, K. A., Natvig, J. R., and Henriksen, M. O., 2005, “Visual Simulation of Shallow Water Waves,” Simul. Model Pract. Theory, 13, pp. 716–726. [CrossRef]
Brodtkorb, A., Hagen, T. R., Lie, K. A., and Natvig, J. R., 2010, “Simulation and Visualization of the Saint-Venant System Using GPUs,” Comput. Visualization Sci., 13, pp. 341–353. [CrossRef]
Brodtkorb, A., and Hagen, T. R., 2010, “A Comparison of Three Commodity-Level Parallel Architectures: Multi-Core CPU, Cell BE and GPU,” MMCS 2008, Vol. 5862, pp. 70–80. [CrossRef]
Elsen, E., LeGresley, P., and Darve, E., 2008, “Large Calculation of the Flow Over a Hypersonic Vehicle Using a GPU,” J. Comput. Phys., 227(24), pp. 10148–10161. [CrossRef]
Buck, I., Foley, T., Horn, D., Sugerman, J., Fatahalian, K., Houston, M., and Hanrahan, P., 2003, “Brook for GPUs: Stream Computing on Graphics Hardware,” ACM Trans. Graphics, 23(3), pp. 777–786. [CrossRef]
Brandvik, T., and Pullan, G., 2008, “Acceleration of a 3D Euler Solver Using Commodity Graphics Hardware,” 46th AIAA Aerospace Sciences Meeting and Exhibit, Reno, NV, Jan. 7–10, AIAA Paper No. 2008-607.
Brandvik, T., and Pullan, G., 2007, “Acceleration of a Two-Dimensional Euler Solver Using Commodity Graphics Hardware,” J. Mech. Eng. Sci., 221(12), pp. 1745–1748. [CrossRef]
Brandvik, T., and Pullan, G., 2009, “An Accelerated 3D Navier-Stokes Solver for Flows in Turbomachines,” ASME Turbo Expo 2009, Orlando, FL, June 8–12, Paper No. GT2009-60052. [CrossRef]
Corrigan, A., Camelli, F., Löhner, R., and Wallin, J., 2009, “Running Unstructured Grid CFD Solvers on Modern Graphics Hardware,” 19th AIAA Computational Fluid Dynamics Conference, July, Paper No. AIAA-2009-4001.
Corrigan, A., Camelli, F., Löhner, R., and Mut, F., 2012, “Semi-Automatic Porting of a Large-Scale FORTRAN CFD Code to GPUs,” Int. J. Numer. Methods Fluids, 69, pp. 314–331. [CrossRef]
Antoniou, A. S., Karantasis, K. I., Polychronopoulos, E. D., and Ekaterinaris, J. A., 2010, “Acceleration of a Finite-Difference WENO Scheme for Large-Scale Simulations on Many-Core Architectures,” 48th AIAA Aerospace Sciences Meeting Including the New Horizons Forum and Aerospace Exposition, Orlando, FL, Jan. 4–7.
Cohen, J. M., and Molemaker, M. J., 2009, “A Fast Double Precision CFD Code Using CUDA,” 21st International Conference on Parallel Computational Fluid Dynamics.
Thibault, J., and Senocak, I., 2009, “CUDA Implementation of a Navier-Stokes Solver on Multi-GPU Desktop Platforms for Incompressible Flows,” 47th AIAA Aerospace Sciences Meeting, Jan. 5–8, Paper No. AIAA 2009-758.
Jacobsen, D., Thibault, J., and Senocak, I., 2010, “An MPI-CUDA Implementation for Massively Parallel Incompressible Flow Computation on Multi-GPU Clusters,” AIAA Aerospace Sciences Meeting, Reno, NV, January.
DeLeon, R., Jacobsen, D., and Senocak, I., 2012, “Large Eddy Simulations of Turbulent Incompressible Flows on GPU Clusters,” 50th AIAA Aerospace Sciences Meeting Including the New Horizons Forum and Aerospace Exposition, pp. 1–13.
Griebel, M., and Zaspel, P., 2010, “A Multi-GPU Accelerated Solver for the Three-Dimensional Two-Phase Incompressible Navier-Stokes Equations,” Comput. Sci. Res. Dev., 25, pp. 65–73. [CrossRef]
Kelly, J., 2009, “GPU-Accelerated Simulation of Two-Phase Incompressible Fluid Flow Using a Level-Set Method for Interface Capturing,” ASME 2009 International Mechanical Engineering Congress and Exposition (IMECE2009), Lake Buena Vista, FL, Nov. 13–19, Paper No. IMECE2009-13330, pp. 2221–2228. [CrossRef]
Jespersen, D. C., 2009, “Acceleration of a CFD Code With a GPU,” NASA Technical Report No. NAS-09-003.
Buning, P. G., Jesperson, D. E., Pulliam, T. H., Chan, W. M., Slotnick, J. P., Krist, S. E., and Renze, K. J., 1998, OVERFLOW User's Manual, Version 1.8, NASA Langley Research Center, Hampton, VA.
Phillips, E. H., Zhang, Y., Davis, R. L., and Owens, J. D., 2009, “Rapid Aerodynamic Performance Prediction on a Cluster of Graphics Processing Units,” 47th AIAA Aerospace Sciences Meeting, Reno, NV, January.
Phillips, E. H., Davis, R. L., and Owens, J. D., 2010, “Unsteady Turbulent Simulations on a Cluster of Graphics Processors,” 40th AIAA Fluid Dynamics Conference, June, Paper No. AIAA 2010-5036.
Asouti, V. G., Trompoukis, X. S., Kampolis, J. C., and Giannakoglou, K. C., 2011, “Unsteady CFD Computations Using Vertex-Centered Finite Volumes for Unstructured Grids on Graphics Processing Units,” Int. J. Numer. Methods Fluids, 67, pp. 232–246. [CrossRef]
Kampolis, J. C., Trompoukis, X. S., Asouti, V. G., and Giannakoglou, K. C., 2010, “CFD Based Analysis and Two-Level Aerodynamic Optimization on Graphics Processing Units,” Comput. Methods Appl. Mech. Eng., 199, pp. 712–722. [CrossRef]
Turek, S., Becker, C., and Kilian, S., 2003, “Hardware-Oriented Numeric and Concepts for PDE Software,” FGCS, Future Gener. Comput. Syst., 22, pp. 217–238. [CrossRef]
Strzodka, R., Doggett, M., and Kolb, A., 2005, “Scientific Computation for Simulations of Programmable Graphics Hardware,” Simul. Model. Pract. Theory, 13, pp. 667–680. [CrossRef]
Patnaik, G., and Obenschain, K. S., 2010, “Using GPU on HPC Applications to Satisfy Low-Power Computational Requirements,” 48th AIAA Aerospace Sciences Meeting, Orlando, FL, January, Paper No. AIAA-2010-524. [CrossRef]
Corrigan, A., and Lohner, R., 2011, “Porting of FEFLO to Multi-GPU Clusters,” 49th AIAA Aerospace Sciences Conference, Orlando, FL, Paper No. 2011-0948. [CrossRef]
Klockner, A., Warburton, T., Bridge, J., and Hesthaven, J. S., 2009, “Nodal Discontinuous Galerkin Methods on Graphics Processors,” J. Comput. Phys., 228, pp. 7863–7882. [CrossRef]
Fatica, M., Jameson, A., and Alonso, J., 2004, “Stream-FLO: An Euler Solver for Streaming Architectures,” AIAA Paper No. AIAA 2004-1090.
Wang, P., Abel, T., and Kaehler, R., 2010, “Adaptive Mesh Fluid Simulations on GPU,” New Astron., 15(7), pp. 581–589. [CrossRef]
Liang, W. Y., Hsieh, T. J., Satria, M., Chang, Y. L., Fang, J. P., Chen, C. C., and Han, C. C., 2009, “A GPU-Based Simulation of Tsunami Propagation and Inundation,” Lect. Notes Comput. Sci., 5574, pp. 593–603. [CrossRef]
Mossaiby, F., Rossi, R., Dadvand, P., and Idelsohn, S., 2012, “OpenCL-Based Implementation of an Unstructured Edge-Based Finite Element Convection-Diffusion Solver on Graphics Hardware,” Int. J. Numer. Methods Eng., 89, pp. 1635–1651. [CrossRef]
Che, S., Boyer, M., Meng, J., Tarjan, D., Sheaffer, J., and Skadron, K., 2008, “A Performance Study of General-Purpose Applications on Graphics Processors Using Cuda,” J. Parallel Distrib. Comput., 68(10), pp. 1370–1380. [CrossRef]
Li, W., Wei, X., and Kaufman, A., 2003, “Implementing Lattice Boltzmann Computation on Graphics Hardware,” Visual Comput., 19, pp. 444–456. [CrossRef]
Kaufman, A., Fan, Z., and Petkov, K., 2009, “Implementing the Lattice Boltzmann Model on Commodity Graphics Hardware,” J. Stat. Mech., 2009, p. P06016. [CrossRef]
Fan, Z., Kuo, Y., Zhao, Y., Qiu, F., Kaufman, A., and Arcieri, W., 2009, “Visual Simulation of Thermal Fluid Dynamics in a Pressurized Water Reactor,” Visual Comput., 25(11), pp. 985–996. [CrossRef]
Tolke, J., 2010, “Implementation of a Lattice Boltzmann Kernel Using the Compute Unified Device Architecture Developed by NVIDIA,” Comput. Visualization Sci., 13, pp. 29–39. [CrossRef]
Tolke, J., and Krafczyk, M., 2008, “Teraflop Computing on a Desktop PC With GPUs for 3D CFD,” Int. J. Comput. Fluid Dyn., 22(7), pp. 443–456. [CrossRef]
Bailey, P., Myre, J., Walsh, S. D. C., Lilja, D. J., and Saar, M. O., 2009, “Accelerating Lattice Boltzmann Fluid Flow Simulations Using Graphics Processors,” International Conference on Parallel Processing, Vienna, Austria.
Feichtinger, C., Habich, J., Kostler, H., Hager, G., Rude, U., and Wellein, G., 2011, “A Flexible Patch-Based Lattice Boltzmann Parallelization Approach for Heterogeneous GPU–CPU Clusters,” Parallel Comput., 37(9), pp. 536–549. [CrossRef]
Obrecht, C., Kuznik, F., Tourancheau, B., and Roux, J. J., 2011, “A New Approach to the Lattice Boltzmann Method for Graphics Processing Units,” Comput. Math. Appl., 61(12), pp. 3628–3638. [CrossRef]
Peng, L., Nomura, K., Oyakawa, T., Kalia, R., Nakano, A., and Vashishta, P., 2008, “Parallel Lattice Boltzmann Flow Simulation on Emerging Multi-Core Platforms,” Lect. Notes Comput. Sci., 5168, pp. 763–777. [CrossRef]
Alam, M. S., and Cheng, L., 2011, “Parallelization of LBM Code Using CUDA Capable GPU Platform for 3D Single and Two-Sided Non-Facing Lid-Driven Cavity Flow,” Proceedings of the ASME 2011 30th International Conference on Ocean, Offshore and Arctic Engineering (OMAE2011), Rotterdam, The Netherlands, June 19–24, pp. 745–753. [CrossRef]
“Sailfish Reference Manual,” Sailfish, http://sailfish.us.edu.pl/index.html
Rustico, E., Bilotta, G., Gallo, G., Herault, A., and Del Negro, C., 2012, “Smoothed Particle Hydrodynamics Simulations on Multi-GPU Systems,” 20th Euromicro International Conference on Parallel, Distributed and Network-Based Processing (PDP). [CrossRef]
Anderson, J. A., Lorenz, C. D., and Travesset, A., 2008, “General Purpose Molecular Dynamics Simulations Fully Implemented on Graphics Processing Units,” J. Comput. Phys., 227, pp. 5342–5359. [CrossRef]
Marsh, D., 2010, “Molecular Dynamics-Lattice Boltzmann Hybrid Method on Graphics Processors,” M.S. thesis, University of Illinois at Urbana-Champaign, Champaign, IL.
Sahu, K., and Vanka, S. P., 2011, “A Multiphase Lattice Boltzmann Study of Buoyancy-Induced Mixing in a Tilted Channel,” Comput. Fluids, 50(1), pp. 199–215. [CrossRef]
He, X., Zhang, R., Chen, S., and Doolen, G. D., 1999, “On the Three-Dimensional Rayleigh-Taylor Instability,” Phys. Fluids, 11(5), pp. 1143–1152. [CrossRef]
Redapangu, P., Vanka, S. P., and Sahu, K., 2012, “Multiphase Lattice Boltzmann Simulations of Buoyancy Induced Flow of Two Immiscible Fluids With Different Viscosities,” Eur. J. Mech., B/Fluids, 34, pp. 105–114. [CrossRef]
Redapangu, P., Sahu, K. C., and Vanka, S. P., 2012, “A Study of Pressure-Driven Displacement Flow of Two Immiscible Liquids Using a Multiphase Lattice Boltzmann Approach,” Phys. Fluids, 24(10), p. 102110. [CrossRef]
Wang, G., Cope, W. K., and Vanka, S. P., 1994, Multigrid Calculations of Twin Jet Impingement With Crossflow: Comparison of Segregated and Coupled Relaxation Strategies, Vol. 196, American Society of Mechanical Engineers, Fluids Engineering Division (Publication) FED, New York, pp. 233–244.
Shinn, A. F., and Vanka, S. P., 2009, “Implementation of a Semi-Implicit Pressure-Based Multigrid Fluid Flow Algorithm on a Graphics Processing Unit,” Proceedings of the ASME (IMECE 2009), Lake Buena Vista, FL, pp. 125–133. [CrossRef]
Shinn, A. F., Vanka, S. P., and Hwu, W. W., 2010, “Direct Numerical Simulation of Turbulent Flow in a Square Duct Using a Graphics Processing Unit (GPU),” 40th AIAA Fluid Dynamics Conference. [CrossRef]
Shinn, A. F., and Vanka, S. P., 2013, “Large Eddy Simulations of Film-Cooling Flows With a Micro-Ramp Vortex Generator,” ASME J. Turbomach., 135(1), p. 011004. [CrossRef]
Chaudhary, R., Vanka, S. P., and Thomas, B. G., 2010, “Direct Numerical Simulations of Magnetic Field Effects on Turbulent Flow in a Square Duct,” Phys. Fluids, 22(7), p. 075102. [CrossRef]
Chaudhary, R., Thomas, B. G., and Vanka, S. P., 2012, “Effect of Electromagnetic Ruler Braking (EMBr) on Transient Turbulent Flow in Continuous Slab Casting Using Large Eddy Simulations,” Metall. Mater. Trans. B, 43(3), pp. 532–553. [CrossRef]
Chaudhary, R., Vanka, S. P., and Thomas, B. G., 2011, “Direct Numerical Simulations of Transverse and Spanwise Magnetic Field Effects on Turbulent Flow in a 2:1 Aspect Ratio Rectangular Duct,” Comput. Fluids, 51(1), pp. 100–114. [CrossRef]
Vanka, S. P., Shinn, A. F., and Sahu, K. C., 2011, “Computational Fluid Dynamics Using Graphics Processing Units: Challenges and Opportunities,” Proceedings of the ASME 2011 IMECE Conference, Denver, CO, pp. 429–437. [CrossRef]
Nicoud, F., and Ducros, F., 1999, “Subgrid-Scale Stress Modelling Based on the Square of the Velocity Gradient Tensor,” Flow, Turbul. Combust., 62(3), pp. 183–200. [CrossRef]
Shinn, A. F., 2011, “Large Eddy Simulations of Turbulent Flows on Graphics Processing Units: Application to Film-Cooling Flows,” Ph.D. thesis, University of Illinois at Urbana-Champaign, Champaign, IL.
Chaudhary, R., 2011, “Studies of Turbulent Flows in Continuous Casting of Steel With and Without Magnetic Field,” Ph.D. thesis, University of Illinois at Urbana-Champaign, Champaign, IL.
Zaman, K. B. M. Q., Rigby, D. L., and Heidman, J. D., 2010, “Inclined Jet in Crossflow Interacting With a Vortex Generator,” J. Propul. Power, 26(5), pp. 947–954. [CrossRef]
Timmel, K., Eckert, S., and Gerbeth, G., 2011, “Experimental Investigation of the Flow in a Continuous-Casting Mold Under the Influence of a Transverse Direct Current Magnetic Field,” Metall. Mater. Trans. B, 42(1), pp. 68–80. [CrossRef]
Timmel, K., Miao, X., Eckert, S., Lucas, D., and Gerbeth, G., 2010, “Experimental and Numerical Modeling of the Steel Flow in a Continuous Casting Mould Under the Influence of a Transverse DC Magnetic Field,” Magnetohydrodynamics, 46(4), pp. 337–448.
Lee, V., Kim, C., Chuggani, J., Deisher, M., Kim, D., Nguyen, A., Satish, N., Smelyansky, M., Chennupaty, S., Hammarlund, P., Singhal, R., and Dubey, P., 2010, “Debunking the 100X GPU vs. CPU Myth: An Evaluation of Throughput Computing on CPU and GPU,” ISCA 10, Saint-Malo, France, June 19–23. [CrossRef]


Fig. 1

Evolution of computing power of CPU and GPU [6]

Fig. 2

Correspondence between GPU grid and computational mesh

Fig. 3

Instantaneous contours of streamwise velocity in turbulent flow through ducts of different shapes. Reynolds numbers ∼4000.

Fig. 4

(a) Instantaneous nondimensional temperature (T − T)/(T − Ts) at midspan for a film cooling jet. (b) Instantaneous nondimensional temperature (T − T)/(T − Ts) at midspan for microramp interacting with a film cooling jet.

Fig. 5

Instantaneous nondimensional axial velocity contours with secondary velocity vectors for a square duct with a magnetic field

Fig. 6

Mean nondimensional axial velocity contours with secondary velocity vectors for a square duct with a magnetic field

Fig. 7

Instantaneous contours of total velocity (m/s) with different magnetic field positions

Fig. 8

Evolution of the isosurface of ϕ at the interface at different times (t = 12, 18, 30, and 50) for parameters Re = 100, viscosity ratio = 10, At = 0.2, Fr = 5, κ = 0.005, and angle = 45 degrees

Fig. 9

Contours of the index function at t = 20 for a midspan plane for Re = 100, At = 0.2, viscosity ratio of 10, Fr = 5, and κ = 0.005

Fig. 10

Initial position of a deformable drop in a square duct

Fig. 11

Transient deformation of the droplet surface for different capillary numbers


