Soonchunhyang University, Department of Computer Engineering, Sang-Jung Lee

[Understanding Big Data]
                                                                        

  • Course Objectives
     This course covers the basic concepts, principles, and application techniques of big data computing. It introduces the basic concepts of big data, then teaches and practices big data processing with the Hadoop Distributed File System and MapReduce on a cluster running in VirtualBox, as well as distributed data processing and analysis techniques using PySpark, and reviews application case studies.

 

 

Course Contents

Reference Sites

Reference Materials

0. Course Introduction

 

1. Introduction to Big Data Computing

GFS paper, Bigtable paper

2. Introduction to Apache Hadoop

 

3-1. Ŭ·¯½ºÅÍ ½Ç½À ȯ°æ
  3-2.
¸®´ª½º ¸í·É °³¿ä

VirtualBox, Ubuntu

 

4. Hadoop Cluster Overview and Installation

 

 

5. Introduction to MapReduce

MapReduce paper

6. Building MapReduce Applications

[MAPR Academy] Build Hadoop MapReduce Applications

 

7. Introduction to Apache Spark

Spark paper

8. Spark DataFrame Operations

[Spark by {Examples}] PySpark Tutorial

Spark SQL paper

9. SFPD Application Example

[Carol McDonald] sparkdataframeexample

10. Word Count Application Example

[Databricks] A simple word count application

 

11. US Stock Application Example

[towards data science] Introduction to PySpark using US Stock Price Data

 

12. Building Spark Applications

 

 

13. Spark Streaming

Apache Spark Structured Streaming

 

14. Sensor Data Application Example

[Carol McDonald] SparkStreamingHBaseExample

 

15. Monitoring Spark Applications

 

 

16. Introduction to Spark Machine Learning

 

 

17. Income Prediction Analysis Example

[databricks] Getting started with MLlib - binary classification example

 

18. Movie Recommendation Example

[towards data science] Movie Recommendation with Collaborative Filtering in Pyspark

 

19. Spark GraphFrames

 

 

20. Flight Operations Example

Analyzing Flight Delays with Apache Spark GraphFrames and MapR Database

 

 

  • Reference Sites
http://hadoop.apache.org/              Apache Hadoop
http://spark.apache.org/               Apache Spark
https://github.com/apache/spark        Spark Git repository
https://research.google.com/           Google Research

https://www.data.go.kr/                Public Data Portal (Korea)
https://grouplens.org/                 GroupLens at the University of Minnesota, provides movie datasets
http://archive.ics.uci.edu/ml          UCI Machine Learning Repository, provides machine learning datasets
https://physionet.org/physiobank/      Provides physiological signals and related data

  • Evaluation: attendance 10%, assignments and presentations 50%, exam 30%