[B! impala] ikeikeikeike's bookmarks

ikeikeikeike's bookmarks about impala (14)

SSSSLIDE
ikeikeikeike 2014/07/18
impala

cloudera

hdfs
Link
Compiled Python UDFs for Impala
2. Impala User-defined Functions (UDFs) • Tuple => Scalar value • Substring • sin, cos, pow, … • Machine-learning models • Supports Hive UDFs (Java) • Relatively unpleasurable • Slower • Impala (native) UDFs • C++ interface designed for efficiency • Similar to Postgres UDFs • Runs any LLVM-compiled code 4. LLVM: C++ example bool StringEq(FunctionContext* context, const StringVal& a
ikeikeikeike 2014/07/18
compiler

context

hive

hadoop

impala

python

impyla

const

c++
Link
Blog | Cloudera
For the inaugural episode of Women Leaders in Technology on The AI Forecast, we welcomed Kari Briski – Vice President AI Software Product Management at NVIDIA. Kari shared the stories and strategies that inform her leadership style (like GSD or “getting stuff done”), what it means to trust your instinct, and the advice she gives to young women embarking on a career in technology and to women furth
ikeikeikeike 2014/07/17
impala.udf

data

disk

api

cloudera

developer

python

impala
Link
Impala
Apache Impala is the open source, native analytic database for open data and table formats. Follow us on Twitter at @ApacheImpala! Do BI-style Queries Impala provides low latency and high concurrency for BI/analytic queries on the Hadoop ecosystem, including Iceberg, open data formats, and most cloud storage options. Impala also scales linearly, even in multitenant environments. Unify Your Infrast
ikeikeikeike 2014/04/15
yes

impala

hadoop
Link
Parquet
Apache Parquet is an open source, column-oriented data file format designed for efficient data storage and retrieval. It provides high performance compression and encoding schemes to handle complex data in bulk and is supported in many programming languages and analytics tools.
ikeikeikeike 2014/04/09
data

compression

encoding

hadoop

impala

parquet

file

columnar storage

format

column oriented database
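Link
Updating Impala metadata | GrepGrape Blog
I want to query logs written to HDFS by fluentd in Impala right away. I am trial-running Impala 1.2.X in a configuration of Impalad, Catalogd, and StateStore. The roles of Catalogd and StateStore are explained in detail in the reference links from Kernel023, which I always rely on. When the Catalogd service is running, changes made by DDL executed through Impala appear to be propagated to the other Impalads in the cluster. Note that running "Refresh METASTORE catalog" from Hue's Impala Query UI fails with the error Invalid method name: 'ResetCatalog' (code THRIFTAPPLICATION): None. Running invalidate metadata as a query works without problems. Currently, fl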
ikeikeikeike 2014/04/01
impala

hdfs
Link
https://docs.cloudera.com/documentation/enterprise/latest.html
ikeikeikeike 2014/03/13
This error mostly seems to mean "add a catalog server". > "ERROR: AnalysisException: This Impala daemon is not ready to accept user requests. Status: Waiting for catalog update from the StateStore."

CDM

impala
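Link
Cloudera Impala architecture
(This blog post is somewhat out of date, so please see the Impala information page, which collects relatively recent information.) This is day 25, the final day, of my one-person Advent calendar. The final day is about Cloudera Impala (hereafter Impala). Impala is a distributed query engine. It recently became available on EMR as well. Some people ask how it differs from Hive and why Hive was not simply made faster; the answers are laid out in detail in the blog post (Impala v Hive) published this week by Mike Olson, one of Cloudera's founders. It is quite interesting, but for now it is available only in English. A Japanese version will surely be readable eventually... Now, for the final day, I will write about the architecture of Cloudera Impala. The materials quoted are ones Cloudera has published on Slideshare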
ikeikeikeike 2014/03/03
impala

hive

cloudera

hadoop
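Link
About Cloudera Impala and the Catalog Service
Impala metadata: Impala uses the same metastore as Hive. Previously, when metadata changed, you had to run the "invalidate metadata/refresh" commands in Impala for the change to be recognized. Starting with Impala 1.2.X the management approach changed, and the Catalog Service, a service that manages metadata changes, was introduced. The Catalog Service is a centralized service that manages metadata. It processes metadata updates and sends, via the StateStore, which metadata changes were made to every Impalad node in the cluster. With this service, metadata changes made through Impala are automatically recognized on all nodes without running the "invalidate metadata" command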
ikeikeikeike 2014/03/03
impala
Link
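The rule described in the Catalog Service entry above can be sketched as a small decision helper. This is only an illustration, not code from the bookmarked article: the function name, its parameters, and the table name `logs` are hypothetical. It also assumes that changes made outside Impala (for example files written straight to HDFS, or tables created in Hive) still need `REFRESH` for an existing table or `INVALIDATE METADATA` for a new one.

```python
from typing import Optional


def metadata_statement(changed_via_impala: bool,
                       table: Optional[str] = None,
                       new_table: bool = False) -> Optional[str]:
    """Pick the metadata statement to issue, or None when none is needed.

    Hypothetical helper: the name and parameters are illustrative only.
    """
    if changed_via_impala:
        # Per the excerpt, the Catalog Service (Impala 1.2.X and later)
        # propagates changes made through Impala to every Impalad.
        return None
    if new_table or table is None:
        # Assumption: a table created outside Impala is not yet known,
        # so the cached metadata has to be invalidated.
        return "INVALIDATE METADATA"
    # Assumption: new data files in an existing table only need a refresh.
    return f"REFRESH {table}"


print(metadata_statement(True, "logs"))
print(metadata_statement(False, "logs"))
print(metadata_statement(False, "logs", new_table=True))
```

The table name is interpolated without quoting or validation, so a real script would need to check it before sending the statement to Impala.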
Hue - Hadoop User Experience - The Apache Hadoop UI — Tutorial: Executing Hive or Impala Queries with Python
Tutorial: Executing Hive or Impala Queries with Python This post talks about Hue, a UI for making Apache Hadoop easier to use. Hue uses a varied set of interfaces for communicating with the Hadoop components. This post describes how Hue is implementing the Apache HiveServer2 Thrift API for executing Hive queries and listing tables. The same interface can also be used for talking to Cloudera Impal
ikeikeikeike 2013/12/12
hive

impala

hadoop

Apache

home

Python

Hue
Link
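The Hue tutorial above is about running queries from Python over the HiveServer2 Thrift API. The sketch below does not speak Thrift itself and is not taken from that post. It assumes a connection object that follows Python's DB-API 2.0, such as the one returned by impyla's `impala.dbapi.connect`. The helper name `fetch_rows` is hypothetical, and the standard-library `sqlite3` module stands in for a cluster so the snippet runs anywhere.

```python
import sqlite3
from typing import Any, List, Tuple


def fetch_rows(conn: Any, sql: str) -> List[Tuple]:
    """Run one statement on a DB-API 2.0 connection and return all rows.

    Hypothetical helper; `conn` is any object exposing cursor().
    """
    cur = conn.cursor()
    try:
        cur.execute(sql)
        return cur.fetchall()
    finally:
        # Always release the cursor, even if execute() raises.
        cur.close()


# Stand-in connection: an in-memory SQLite database instead of Impala.
demo_conn = sqlite3.connect(":memory:")
demo_rows = fetch_rows(demo_conn, "SELECT 1 + 1")
print(demo_rows)
demo_conn.close()
```

Against a real cluster the same helper would be given an Impala connection in place of `demo_conn`; the host and port are deployment-specific and are left out here.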
Impala SQL Language Elements
The Impala SQL dialect supports a range of standard elements, plus some extensions for Big Data use cases related to data loading and data warehousing. Note: In early Impala beta releases, a semicolon was optional at the end of each statement in the impala-shell interpreter. Now that the impala-shell interpreter supports multi-line commands, making it easy to copy and paste code from script files,
ikeikeikeike 2013/11/04
accurate

data

impala

cloudera

sql
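Link
Hive Hacks, more or less out of desperation – OpenGroove
Assorted Hive hacks. The content is almost entirely quoted straight from O'REILLY Hadoop Hacks. These are just personal notes, but I am shamelessly publishing them. Even when I record things in various places I soon end up wondering "where did I put that note?", so writing it here works best. Not that writing it down means I understand it... (A warning up front: this post is long.) Basic principles. ● Avoid UPDATE: because it slows processing down, avoid converting SQL that contains many UPDATEs into HiveSQL. ● MapReduce task overhead: Hive inherits the MapReduce trait of being "suited to processing that aims for high throughput, but not to processing that aims for low latency". Keep in mind that MapReduce task overhead is always present. ● Processing that cannot be parallelized and distributed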
ikeikeikeike 2013/10/22
hive

join

hadoop

impala
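Link
A little Impala practice – OpenGroove
A record of trying out Impala very lightly. Since I was at it, I also compared it with Hive. The environment is Hadoop in pseudo-distributed mode built on an AWS m1.large instance. The setup procedure is described in the previous post. Download the sample data. Run the following somewhere convenient on the machine. I do not know how long this link will stay up, but it is the sample I used earlier for Hive practice. $ wget http://image.gihyo.co.jp/assets/files/book/2012/978-4-7741-5389-6/download/sample.zip $ unzip sample.zip Put the tsv file from this archive into HDFS. This uses a relative path. $ hadoop fs -put /tmp/sales.tsv test/ Start the Impala shell and create a table. The source data, by file name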
ikeikeikeike 2013/09/04
impala

hive
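Link
Cloudera Impala presentation materials | 外道父の匠
On 11/26 I gave a presentation on Cloudera Impala at the 13th Hadoop Source Code Reading. It started on Twitter with the suggestion "why not invite 外道父, who would even ◯ the incarnation of beer?"; a request tweet arrived in under a minute and I accepted. This is an industry with very light footwork, where everything gets settled on Twitter within minutes. Below I write up the presentation materials and some supplementary notes. Links: Eventbrite : 13th Hadoop Source Code Reading Twitter #hadoopreading togetter : 13th Hadoop Source Code Reading summary Inside Impala Coordinator at HSCR 13th – Go ahead! by @repeatedly Inside Impala -Query Exec Engine- by @o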
ikeikeikeike 2013/07/12
hive

hadoop

benchmark

impala

cloudera

performance

databases

外道父

search

load balancing
Link