An Experiment in MapReduce Operations on a Cloud Built from Single-Board Computers

Levent AYSAN, İzzet Gökhan ÖZBİLGİN

Abstract


Information systems today generate far larger volumes of data than in the past. The data produced in the last two years amounts to about 90% of all data. Storing and analyzing this data raises serious resource problems. The systems needed to store, process, and analyze Big Data must run faster and consume less energy than current systems; otherwise, costs and analysis times become very large. In this study, a cluster was built from single-board mini personal computers, container-based virtualization was set up on it, and big data algorithms were run. The study examines the execution of MapReduce operations, which form the foundation of big data systems, on purpose-built ARM processor clusters and tests how effective that execution is. The study attempts to show that ARM-based single-board mini computers, with their low cost, low energy consumption, and low carbon emissions, can offer many advantages when chosen for cloud computing.

Keywords


Cloud computing, parallel single-board computer cluster, big data, MapReduce, Hadoop



DOI: http://dx.doi.org/10.17671/btd.88292

Appendix 1

TestDFSIO -read (overall)    1287    112.863
TestDFSIO -read -nrFiles 10 -fileSize 1    35390    6290
TestDFSIO -write (overall)    2731    111.845
TestDFSIO -write -nrFiles 10 -fileSize 1    45140    8820

Installation Steps

Linaro Installation

Linaro was used on the Parallella boards. The linaro-trusty-developer-20140623 ... 6tar.gz archive downloaded from releases.linaro.org/14.06/ubuntu/trusty-images/developer was written to SD cards in bootable form to perform the installation.

Spent 16ms computing TeraScheduler splits.
Computing input splits took 1789ms
Sampling 2 splits of 2
Making 1 from 100 sampled records
Computing parititions took 1380ms
File System Counters
    FILE: Number of bytes read=10406
    FILE: Number of bytes written=340350
    FILE: Number of read operations=0
    FILE: Number of large read operations=0
    FILE: Number of write operations=0
    HDFS: Number of bytes read=10202
    HDFS: Number of bytes written=10000
    HDFS: Number of read operations=9
    HDFS: Number of large read operations=0
    HDFS: Number of write operations=2
Job Counters
    Launched map tasks=2
    Launched reduce tasks=1
    Data-local map tasks=1
    Rack-local map tasks=1
    Total time spent by all maps in occupied slots (ms)=92783
    Total time spent by all reduces in occupied slots (ms)=20735
    Total time spent by all map tasks (ms)=92783
    Total time spent by all reduce tasks (ms)=20735
    Total vcore-seconds taken by all map tasks=92783
    Total vcore-seconds taken by all reduce tasks=20735
    Total megabyte-seconds taken by all map tasks=95009792
    Total megabyte-seconds taken by all reduce tasks=21232640
Map-Reduce Framework
    Map input records=100
    Map output records=100
    Map output bytes=10200
    Map output materialized bytes=10412
    Input split bytes=202
    Combine input records=0
    Combine output records=0
    Reduce input groups=100
    Reduce shuffle bytes=10412
    Reduce input records=100
    Reduce output records=100
    Spilled Records=200
    Shuffled Maps =2
    Failed Shuffles=0
    Merged Map outputs=2
    GC time elapsed (ms)=3517
    CPU time spent (ms)=7530
    Physical memory (bytes) snapshot=375734272
    Virtual memory (bytes) snapshot=1086078976
    Total committed heap usage (bytes)=256647168
Shuffle Errors
    BAD_ID=0
    CONNECTION=0
    IO_ERROR=0
    WRONG_LENGTH=0
    WRONG_MAP=0
    WRONG_REDUCE=0
File Input Format Counters
    Bytes Read=10000
File Output Format Counters
    Bytes Written=10000
15/04/01 13:59:59 INFO terasort.TeraSort: done

terasort Ref
hduser@hadoop3:~$ yarn jar /usr/local/hadoop/share/hadoop/mapreduce/hadoop-mapreduce-examples-2.6.0.jar terasort

File System Counters
    FILE: Number of bytes read=10406
    FILE: Number of bytes written=340350
    FILE: Number of read operations=0
    FILE: Number of large read operations=0
    FILE: Number of write operations=0
    HDFS: Number of bytes read=10202
    HDFS: Number of bytes written=10000
    HDFS: Number of read operations=9
    HDFS: Number of large read operations=0
    HDFS: Number of write operations=2
Job Counters
    Launched map tasks=2
    Launched reduce tasks=1
    Data-local map tasks=2
    Total time spent by all maps in occupied slots (ms)=64639
    Total time spent by all reduces in occupied slots (ms)=32219
    Total time spent by all map tasks (ms)=64639
    Total time spent by all reduce tasks (ms)=32219
    Total vcore-seconds taken by all map tasks=64639
    Total vcore-seconds taken by all reduce tasks=32219
    Total megabyte-seconds taken by all map tasks=66190336
    Total megabyte-seconds taken by all reduce tasks=32992256
Map-Reduce Framework
    Map input records=100
    Map output records=100
    Map output bytes=10200
    Map output materialized bytes=10412
    Input split bytes=202
    Combine input records=0
    Combine output records=0
    Reduce input groups=100
    Reduce shuffle bytes=10412
    Reduce input records=100
    Reduce output records=100
    Spilled Records=200
    Shuffled Maps =2
    Failed Shuffles=0
    Merged Map outputs=2
    GC time elapsed (ms)=50
    CPU time spent (ms)=1100
    Physical memory (bytes) snapshot=714772480
    Virtual memory (bytes) snapshot=2177273856
    Total committed heap usage (bytes)=603979776
Shuffle Errors
    BAD_ID=0
    CONNECTION=0
    IO_ERROR=0
    WRONG_LENGTH=0
    WRONG_MAP=0
    WRONG_REDUCE=0
File Input Format Counters
    Bytes Read=10000
File Output Format Counters
    Bytes Written=10000
15/03/31 14:38:03 INFO terasort.TeraSort: done
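The two TeraSort logs above can be compared counter by counter. The following Python sketch is only an illustration and not part of the original study: the dictionary keys and the names `first_run`, `ref_run`, `implied_container_mb`, and `ratio` are hypothetical, and the values are copied from the logs. The first log carries no label in the appendix, so treating it as the single-board cluster run is an assumption; the second is the run marked "Ref".

```python
# Selected counters copied from the two TeraSort logs (times in ms).
# Assumption: the unlabeled first log is the single-board cluster run;
# the second log is the one marked "Ref" in the appendix.
first_run = {
    "map_ms": 92783, "reduce_ms": 20735,
    "map_mb_s": 95009792, "reduce_mb_s": 21232640,
    "gc_ms": 3517, "cpu_ms": 7530,
}
ref_run = {
    "map_ms": 64639, "reduce_ms": 32219,
    "map_mb_s": 66190336, "reduce_mb_s": 32992256,
    "gc_ms": 50, "cpu_ms": 1100,
}


def implied_container_mb(run):
    """Divide the 'megabyte-seconds' counters by the task times.

    Returns (map, reduce) quotients; equal values suggest a fixed
    memory allocation per task container.
    """
    return (run["map_mb_s"] / run["map_ms"],
            run["reduce_mb_s"] / run["reduce_ms"])


def ratio(key):
    """Ratio of the first run's counter to the reference run's counter."""
    return first_run[key] / ref_run[key]


if __name__ == "__main__":
    print("implied MB per task (first run):", implied_container_mb(first_run))
    print("implied MB per task (ref run):  ", implied_container_mb(ref_run))
    for key in ("map_ms", "reduce_ms", "cpu_ms", "gc_ms"):
        print(f"{key}: first/ref = {ratio(key):.2f}")
```

In both logs the megabyte-seconds counters divided by the task times give exactly 1024, which would be consistent with 1024 MB task containers if those counters are scaled by task milliseconds. The ratios show the first run spending more time in map tasks, CPU, and garbage collection than the reference run, and less time in the reduce task, on this 100-record input.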




Creative Commons License
This work is licensed under a Creative Commons Attribution 4.0 License.