[B! mapreduce] at_yasu's bookmarks

at_yasu's bookmarks on mapreduce (20)

GitHub - twitter/summingbird: Streaming MapReduce with Scalding and Storm
at_yasu 2014/01/29
Oh, interesting.

twitter

mapreduce

scala
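Link
Twitter open-sources Summingbird, its streaming MapReduce framework
Practical Guide to Building an API Back End with Spring Boot, 2nd Edition. Thousands of developers have learned the basics of building REST APIs with Spring Boot from InfoQ's mini-book "Practical Guide to Building an API Back End with Spring Boot". That book used Spring Boot 2, the version newly released at the time of publication. However, Spring Boot 3 was recently released, and important chan...
at_yasu 2014/01/29
Oh, interesting.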

mapreduce

oss

twitter
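Link
What will become of MapReduce? - 急がば回れ、選ぶなら近道
It is 2012 and I have been puzzling over this quite a bit, so I am writing it down. I plan to look into it again around the end of the fiscal year, so treat this as provisional. First, as a premise: MapReduce, Hadoop's current execution framework, is by no means efficient at execution. That part is rather painful. Even so, judged on quality and ease of handling for general-purpose large-scale parallel processing, it has "in the end" turned out to be very well balanced, and it is widely used. As for where MapReduce goes from here, the trend currently seems to be split in two, so here are my notes on that. ■YARN One direction is MapReduce 2.0 as implemented in the current Hadoop 2.0 line, or rather, using an execution platform other than MapReduce. That is, using BSP or MPI. In short, it is close to a movement to reuse the existing results of parallel processing as they are. MapReduce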
at_yasu 2012/10/09
mapreduce

algorithms

programming
Link
Hadoop summit 2012 report
Slides presented at the 10th Hadoop Source Code Reading. A report on attending Hadoop Summit, held in June 2012, introducing sessions on YARN, HBase, HDFS HA, and more.
at_yasu 2012/06/26
mapreduce

hadoop
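Link
10 algorithms that can be run with MapReduce - データサイエンティスト上がりのDX参謀・起業家
With Hadoop and Mahout, machine learning can be done even on big data. Every method implemented in Mahout is therefore an algorithm that can be processed in a distributed fashion. The algorithms implemented in Mahout are listed here. There is also a 2006 paper, "Map-Reduce for Machine Learning on Multicore", that introduces several algorithms. So this time (I don't know how many times this has been done before, but partly for my own understanding) I would like to briefly note the algorithms introduced in this paper and how each one is distributed. The key point is whether the statistic to be computed is in summation form (a form that can be expressed as a sum). If it is not, it has to be "skillfully" broken down into MapReduce form. *As usual, if there are mistakes, as needed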
at_yasu 2012/05/29
algorithm

MapReduce
Link
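The "summation form" point in the excerpt above can be illustrated with a small sketch. This is not code from the linked article or from Mahout; the names `map_chunk` and `combine` and the sample data are hypothetical. It assumes the statistic of interest (here, mean and variance) can be rebuilt from per-chunk partial sums, which is what makes it easy to split across workers.

```python
from functools import reduce

def map_chunk(chunk):
    # Each worker emits partial sums for its own chunk only:
    # (count, sum of x, sum of x squared).
    return (len(chunk), sum(chunk), sum(x * x for x in chunk))

def combine(a, b):
    # Partial sums from different chunks are merged by element-wise addition.
    return tuple(x + y for x, y in zip(a, b))

# Hypothetical data, already split into three chunks.
chunks = [[1.0, 2.0], [3.0], [4.0, 5.0, 6.0]]

n, s, ss = reduce(combine, map(map_chunk, chunks))
mean = s / n
variance = ss / n - mean ** 2  # population variance from the partial sums
```

Because `combine` is plain addition, the chunks can be processed in any order and on any number of machines; a statistic such as the median has no such decomposition and would need a different approach.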
Treasure Data, Inc. | Finding Gems in Your Big Data
Deliver The Experiences You Can’t Today. Unlike other customer data platforms (CDPs), only Treasure Data combines batch and real-time data to personalize journeys with AI. The results? Increased conversions and optimized spend across channels.
at_yasu 2012/05/15
Oh, I see.

web

service

mapreduce
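Link
Price disruption for big data? Google launches BigQuery, an analytics service that is "free up to 100 GB of processing per month" - ITジャーナリスト星暁雄の"情報論"ノート
How will information and technology change the future? IT, smart devices, robots, electronics tinkering, and the architecture of media. Kazunori Sato, who works at Google, has posted a concise explanation on Google+. Post 1: BigQuery is now generally available! It is a massively parallel query service that completes a full scan of tens of billions of rows in tens of seconds, and, alongside MapReduce, one of the prized technologies underpinning Google's core. Google BigQuery brings Big Data analytics to all businesses - Google Developers Blog Post 2: A quick BigQuery explainer: BigQuery is a public service built on the massively parallel query infrastructure known inside Google as "Dremel". Like Sybase IQ and Oracle Exadata, Dremel is a columnar DB
at_yasu 2012/05/02
Oh, interesting.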

google

mapreduce
Link
Cascading
Please note that all new project news and releases have moved to https://cascading.wensel.net The Cascading Ecosystem is a collection of applications, languages, and APIs for developing data-intensive applications. At the ecosystem core is Cascading, a Java API for defining complex data flows and integrating those flows with back-end systems, and a query planner for mapping and executing logical f
at_yasu 2012/04/10
Oh, I see.

hadoop

mapreduce

programming
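Link
MapReduce Patterns, Algorithms, and Use Cases - きしだのHatena
A translation of "MapReduce Patterns, Algorithms, and Use Cases" by Ilya Katsov http://highlyscalable.wordpress.com/2012/02/01/mapreduce-patterns/ (I meant to keep this as a draft and polish it, but it somehow got published, so I expect to make various corrections later) February 1, 2012 In this article I have collected a number of MapReduce patterns and algorithms to give a systematic view of the different techniques found on the web and in scientific papers. Several practical case studies are also provided. All descriptions and code snippets use the standard Hadoop MapReduce model with Mappers, Reducers, Combiners, Partitioners, and sorting. This frame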
at_yasu 2012/03/04
programming

mapreduce
Link
liris.org - This website is for sale! - liris resources and information
at_yasu 2012/01/06
hadoop

python

mapreduce
Link
Introduction to Parallel Programming and MapReduce - Google Code University - Google Code
Introduction to Parallel Programming and MapReduce Table of Contents Audience and Pre-Requisites This tutorial covers the basics of parallel programming and the MapReduce programming model. The pre-requisites are significant programming experience with a language such as C++ or Java, and data structures & algorithms. Serial vs. Parallel Programming In the early days of computing, programs were s
at_yasu 2011/10/25
programming

mapreduce

google
Link
Celery - Distributed Task Queue — Celery 3.0.11 documentation
Documentation has moved Celery is now using Read the Docs to host the documentation for the development version, where the pages are automatically updated as new changes are made (YAY!) The new location for this page is: http://docs.celeryproject.org/en/master/index.html. If you wanted documentation for the latest stable version instead, please go to: http://docs.celeryproject.org/en/latest/index.
at_yasu 2011/01/17
A Python module for running something like GAE's TaskQueue

python

mapreduce
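Link
Why are top companies rushing to Hadoop?
Incidentally, the MapReduce code needed for this analysis is said to be a mere 20 steps in size. According to Yahoo! presenter Eric Baldeschwieler, even inexperienced engineers can program with MapReduce. Joe Cunningham of VISA also provided valuable data, which is introduced below. At VISA, 100 million transactions occur per day, so more than 70 billion transaction log records accumulate over two years, amounting to 36 terabytes of data. Analyzing data at this scale with a conventional RDB used to take about one month, but with Hadoop this was reportedly cut to 13 minutes. Until now, both Yahoo! and VISA had no option but to push enormous volumes of data into an RDB, and the analysis took dozens of days
at_yasu 2009/10/16
Hmm? Does Hadoop also double as a replacement for GFS???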

mapreduce

web
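Link
Trying out Amazon Elastic MapReduce - moratorium
Trying out Amazon Elastic MapReduce 2009-04-03 (Fri) 3:06 Amazon EC2 Another EC2 post, continuing from the past few days. Today Amazon released a service called Elastic MapReduce. I think it is a truly revolutionary service that brings large-scale data processing technology into the hands of the general public all at once. Amazon Elastic MapReduce Amazon ElasticMapReduce introduction video With Hadoop, Amazon Adds A Web-Scale Data Processing Engine To Its Cloud Computer by techcrunch.com Elastic MapReduce is a service that lets you run MapReduce, one of Google's core technologies, with hourly billing. As for MapReduce,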
at_yasu 2009/04/03
amazon

google

mapreduce

web
Link
Disco MapReduce
Disco is a lightweight, open-source framework for distributed computing based on the MapReduce paradigm. Disco is powerful and easy to use, thanks to Python. Disco distributes and replicates your data, and schedules your jobs efficiently. Disco even includes the tools you need to index billions of data points and query them in real-time. Disco was born in Nokia Research Center in 2008 to solve rea
at_yasu 2008/12/26
disco

python

erlang

mapreduce
Link
Running Hadoop On Ubuntu Linux (Multi-Node Cluster) - Michael G. Noll
What we want to do In this tutorial, I will describe the required steps for setting up a multi-node Hadoop cluster using the Hadoop Distributed File System (HDFS) on Ubuntu Linux. Hadoop is a framework written in Java for running applications on large clusters of commodity hardware and incorporates features similar to those of the Google File System and of MapReduce. HDFS is a highly fault-tolera
at_yasu 2008/12/07
mapreduce

hadoop

linux
Link
"Happy", a framework for writing Hadoop MapReduce in Python (Jython)
at_yasu 2008/11/10
python

mapreduce
Link
Hadoop Python: Writing An Hadoop MapReduce Program In Python - Michael G. Noll
In this tutorial, I will describe how to write a simple MapReduce program for Hadoop in the Python programming language. Motivation Even though the Hadoop framework is written in Java, programs for Hadoop need not to be coded in Java but can also be developed in other languages like Python or C++ (the latter since version 0.14.1). However, the documentation and the most prominent Python example o
at_yasu 2008/08/27
mapreduce

python
Link
Slides from my Kansai.pm talk (MapReduce with Hadoop Streaming) - naoyaのはてなダイアリー
I attended Kansai.pm. It was a lot of fun. I also gave a talk titled "MapReduce with Hadoop Streaming". For now, here are the slides. http://bloghackers.net/~naoya/ppt/080530kansai pm.ppt MapReduce is the distributed parallel batch processing system running in Google's backend. GFS is Google's distributed file system. Hadoop is being developed as an open-source clone of this Google software. Hadoop is reportedly also used at Yahoo! Inc, Facebook, Amazon.com, and elsewhere. Hadoop is written in Java, but with Hadoop Streaming you can do MapReduce in languages other than Java. See also the following entries
at_yasu 2008/06/11
Very interesting :)

google

mapreduce

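materials
Link
MapReduce - naoyaのはてなダイアリー
"MapReduce" is a parallel computation system used in Google's backend. It was built for batch processing of large-scale input data, such as building search engine indexes. The interesting thing about MapReduce is that, just by defining a combination of two functions, map() and reduce(), you can solve a variety of computational problems over large-scale data. The MapReduce computation model: map() is passed, one after another, the key-value pairs that make up the data for the problem. map() transforms each key-value pair into multiple different key-value pairs. reduce() is passed, in order, the key-value pairs produced by map(), grouped by identical key. Converting those key-values pairs into an arbitrary form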
at_yasu 2008/05/12
google

mapreduce

development

algorithms
Link
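The map()/reduce() model described in the excerpt above can be sketched in a few lines of single-process Python. This is only an illustration under assumptions: `run_mapreduce`, `map_fn`, and `reduce_fn` are hypothetical names, the word-count task is chosen for the example, and a real system such as Hadoop would distribute the grouping step across machines rather than use an in-memory dictionary.

```python
from collections import defaultdict

def map_fn(key, value):
    # key: line number (unused here), value: one line of text.
    # Emits one (word, 1) pair per word.
    for word in value.split():
        yield word, 1

def reduce_fn(key, values):
    # Receives every value emitted for the same key and sums them.
    yield key, sum(values)

def run_mapreduce(records, map_fn, reduce_fn):
    # Group step: collect the values emitted by map_fn under each key.
    groups = defaultdict(list)
    for key, value in records:
        for out_key, out_value in map_fn(key, value):
            groups[out_key].append(out_value)
    # Reduce step: call reduce_fn once per key, in sorted key order.
    results = []
    for key in sorted(groups):
        results.extend(reduce_fn(key, groups[key]))
    return results

# Hypothetical input: (line number, line text) pairs.
lines = [(0, "a b a"), (1, "b c")]
word_counts = run_mapreduce(lines, map_fn, reduce_fn)
```

Only `map_fn` and `reduce_fn` are specific to the problem; swapping them out while keeping `run_mapreduce` unchanged is the point the excerpt makes about defining just two functions.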