D. Pao and Z. Lu, "A multi-pipeline architecture for high-speed packet classification," Computer Communications, vol. 54, pp. 84-96, 2014.
 B. S. Tumari and W. Lakshmipriya, "FPGA Implementation of Binary-tree-based High Speed Packet Classification System," International Journal of Combined Research & Development (IJCRD), vol. 2, pp. 17-22, 2014.
 K. Zheng, H. Che, Z. Wang and B. Liu, "TCAM-based distributed parallel packet classification algorithm with range-matching solution," in INFOCOM 2005. 24th Annual Joint Conference of the IEEE Computer and Communications Societies. Proceedings IEEE, pp. 293-303, 2005.
 K. Zheng, H. Che, Z. Wang, B. Liu and X. Zhang, "DPPC-RE: TCAM-based distributed parallel packet classification with range encoding," IEEE Transactions on Computers, vol. 55, pp. 947-961, 2006.
 Z. Cao, M. Kodialam and T. Lakshman, "Traffic steering in software defined networks: planning and online routing," ACM SIGCOMM Computer Communication Review - SIGCOMM'14, vol. 44, pp. 65-70, 2014.
 K. Guerra Perez, X. Yang, S. Scott-Hayward and S. Sezer, "A configurable packet classification architecture for Software-Defined Networking," in 27th IEEE International System-on-Chip Conference (SOCC), pp. 353-358, 2014.
 S. Han, K. Jang and S. Moon, "PacketShader: a GPU-accelerated software router," ACM SIGCOMM Computer Communication Review, vol. 41, pp. 195-206, 2011.
 K. G. Perez, X. Yang, S. Scott-Hayward and S. Sezer, "Optimized packet classification for Software-Defined Networking," in International Conference on Communications (ICC), IEEE, pp. 859-864, 2014.
 D. E. Taylor, "Survey and taxonomy of packet classification techniques," ACM Computing Surveys (CSUR), vol. 37, pp. 238-275, 2005.
 S. Zhou, Y. R. Qu and V. K. Prasanna, "Multi-core implementation of decomposition-based packet classification algorithms," in Parallel Computing Technologies, vol. 7979, ed: Springer, pp. 105-119, 2013.
 V. Srinivasan, S. Suri and G. Varghese, "Packet classification using tuple space search," ACM SIGCOMM Computer Communication Review, vol. 29, pp. 135-146, 1999.
 H. Lim, Y. Choe, M. Shim and J. Lee, "A Quad-Trie Conditionally Merged with a Decision Tree for Packet Classification," IEEE Communications Letters, vol. 18, pp. 676-679, 2014.
 S. Parsa and M. Hamzei, "Tiling nested loops with regard to data locality for parallel execution on multi-core processors," Journal of Electrical Engineering, University of Tabriz, vol. 45, no. 3, pp. 17-26, 1394 (in Persian).
 H. Lim, S. Lee and E. E. Swartzlander Jr, "A new hierarchical packet classification algorithm," Computer Networks, vol. 56, pp. 3010-3022, 2012.
 NVIDIA, NVIDIA CUDA Compute Unified Device Architecture Programming Guide, version 6.5, August 2015. Available: http://docs.nvidia.com/cuda/pdf/CUDA_C_Programming_Guide.pdf
 AMD: Global Provider of Innovative Graphics, Processors, August 2015. Available: http://www.amd.com
 Y. Li, D. Zhang, A. X. Liu and J. Zheng, "GAMT: a fast and scalable IP lookup engine for GPU-based software routers," in Proceedings of the ninth ACM/IEEE symposium on Architectures for networking and communications systems, pp. 1-12, 2013.
 T.-H. Li, H.-M. Chu and P.-C. Wang, "IP address lookup using GPU," in 14th International Conference on High Performance Switching and Routing (HPSR), IEEE, pp. 177-184, 2013.
 R. S. Sinha, S. Singh, S. Singh and V. K. Banga, "Speedup Genetic Algorithm Using C-CUDA," in Fifth International Conference on Communication Systems and Network Technologies (CSNT), pp. 1355-1359, 2015.
 M. Beyeler, N. Oros, N. Dutt and J. L. Krichmar, "A GPU-accelerated cortical neural network model for visually guided robot navigation," Neural Networks, in press, 2015.
 Y. Lu, Y. Zhu, M. Han, J. S. He and Y. Zhang, "A survey of GPU accelerated SVM," in Proceedings of the 2014 ACM Southeast Regional Conference, pp. 15-23, 2014.
 P. Przymus and K. Kaczmarski, "Dynamic compression strategy for time series database using GPU," in New Trends in Databases and Information Systems, ed: Springer, pp. 235-244, 2014.
 G. Vasiliadis, E. Athanasopoulos, M. Polychronakis and S. Ioannidis, "PixelVault: Using GPUs for securing cryptographic operations," in Proceedings of the 2014 ACM SIGSAC Conference on Computer and Communications Security, pp. 1131-1142, 2014.
 C.-L. Hung, Y.-L. Lin, K.-C. Li, H.-H. Wang and S.-W. Guo, "Efficient GPGPU-based parallel packet classification," in Trust, Security and Privacy in Computing and Communications (TrustCom), pp. 1367-1374, 2011.
 A. Nottingham and B. Irwin, "GPU packet classification using OpenCL: a consideration of viable classification methods," in Proceedings of the 2009 Annual Research Conference of the South African Institute of Computer Scientists and Information Technologists, pp. 160-169, 2009.
 Y. Deng, X. Jiao, S. Mu, K. Kang and Y. Zhu, "NPGPU: Network Processing on Graphics Processing Units," in Theoretical and Mathematical Foundations of Computer Science, ed: Springer, pp. 313-321, 2011.
 K. Kang and Y. S. Deng, "Scalable packet classification via GPU metaprogramming," in Design, Automation & Test in Europe Conference & Exhibition (DATE), pp. 1-4, 2011.
 D. E. Taylor and J. S. Turner, "Classbench: A packet classification benchmark," in INFOCOM 2005. 24th Annual Joint Conference of the IEEE Computer and Communications Societies, pp. 2068-2079, 2005.
 S. Zhou, S. G. Singapura and V. K. Prasanna, "High-Performance Packet Classification on GPU," in High Performance Extreme Computing Conference (HPEC), pp. 1-6, 2014.
 M. Varvello, R. Laufer, F. Zhang and T. Lakshman, "Multi-Layer Packet Classification with Graphics Processing Units," in Proceedings of the 10th ACM International on Conference on emerging Networking Experiments and Technologies, pp. 109-120, 2014.
 J. Zheng, D. Zhang, Y. Li and G. Li, "Accelerate Packet Classification Using GPU: A Case Study on HiCuts," in Computer Science and its Applications, ed: Springer, pp. 231-238, 2015.
 E. Olyaei Torshizi and H. Sharifi, "Two new hybrid decoding algorithms with very good performance and very low complexity for decoding LDPC codes," Journal of Electrical Engineering, University of Tabriz, vol. 45, no. 4, pp. 27-37, 1394 (in Persian).
 L. Ma, K. Agrawal and R. D. Chamberlain, "A memory access model for highly-threaded many-core architectures," Future Generation Computer Systems, vol. 30, pp. 202-215, 2014.
 J. S. Kirtzic and O. Daescu, "A parallel algorithm development model for the GPU architecture," in Proc. of International Conf. on Parallel and Distributed Processing Techniques and Applications, pp. 1-9, 2012.
 M. Amarís, D. Cordeiro, A. Goldman and R. Y. de Camargo, "A Simple BSP-based Model to Predict Execution Time in GPU Applications," in 22nd annual IEEE International Conference on High Performance Computing (HiPC 2015), pp. 285-294, 2015.
 K. Nakano, "The hierarchical memory machine model for GPUs," in IEEE 27th International Parallel and Distributed Processing Symposium Workshops & PhD Forum (IPDPSW), pp. 591-600, 2013.
 K. Nakano, "Simple memory machine models for GPUs," International Journal of Parallel, Emergent and Distributed Systems, vol. 29, pp. 17-37, 2014.
 S. A. Haque and N. Xie, "A many-core machine model for designing algorithms with minimum parallelism overheads," arXiv preprint arXiv:1402.0264, 2014.
 S. Hong and H. Kim, "An analytical model for a GPU architecture with memory-level and thread-level parallelism awareness," in ACM SIGARCH Computer Architecture News, pp. 152-163, 2009.
 W. Liu, W. Müller-Wittig and B. Schmidt, "Performance predictions for general-purpose computation on GPUs," in International Conference on Parallel Processing (ICPP), pp. 50-50, 2007.
 L. Ma, R. D. Chamberlain and K. Agrawal, "Performance modeling for highly-threaded many-core GPUs," in 25th International Conference on Application-specific Systems, Architectures and Processors (ASAP), pp. 84-91, 2014.
 S. H. Roosta, Parallel processing and parallel algorithms: theory and computation, Springer Science & Business Media, 2012.
 D. A. Jacobsen, J. C. Thibault and I. Senocak, "An MPI-CUDA implementation for massively parallel incompressible flow computations on multi-GPU clusters," in 48th AIAA aerospace sciences meeting and exhibit, pp. 1-16, 2010.
 M. Bernaschi, M. Bisson and M. Fatica, "Colloquium: Large scale simulations on GPU clusters," The European Physical Journal B, vol. 88, pp. 1-10, 2015.