[B! Regular Expressions] lockcole's Bookmarks

技術メモ帳 - A Small Trick for Extracting Words

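In "shell のちょっとしたテクニック - odz buffer", odz introduces a one-liner for counting word frequencies. There are several ways to do the word extraction step, so I will go through them here. egrep -o in particular is very practical. Of course, this is not the main subject of the linked post; it is purely for my own satisfaction.

Extracting words with tr -cs: $ tr -cs 'a-zA-Z' '\n' < ./.zshrc The command above extracts the strings made up only of a-zA-Z. The -c option converts the complement of 'a-zA-Z' (everything else) to '\n', and the -s option squeezes repeated occurrences of the characters given in the last argument, that is, it collapses runs of '\n' into one.

Extracting words with grep -o: grep has an option called -o (--only-matching). This option, for the specified regular …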
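For readers who want the same idea outside the shell, here is a minimal Python sketch of the word extraction described above. It is not taken from the linked post: the function name `extract_words` and the sample text are assumptions made for illustration. `re.findall` returns each maximal run of ASCII letters, which is roughly what `tr -cs 'a-zA-Z' '\n'` and `grep -o '[a-zA-Z]*'`-style matching produce one per line.

```python
import re


def extract_words(text: str) -> list[str]:
    # Return every maximal run of ASCII letters, in order of appearance.
    # Everything that is not a-zA-Z acts as a separator, and consecutive
    # separators never produce empty entries.
    return re.findall(r"[a-zA-Z]+", text)


# Hypothetical stand-in for the contents of a file such as ./.zshrc.
sample = "export PATH=$HOME/bin:$PATH  # zsh setup\nalias ll='ls -l'\n"
words = extract_words(sample)
print("\n".join(words))
```

One small difference from `tr -cs`: if the input begins with a non-letter, `tr` emits a leading empty line, whereas this sketch simply returns no entry for it.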

lockcole 2007/02/20

I fell in love with "egrep -o '[0-9]+(\.[0-9]+){3}' < /var/log/system.log | sort | uniq -c | sort -nr | head".
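The pipeline quoted above pulls every IPv4-looking string out of a log, counts the duplicates, and shows the most frequent ones. A rough Python sketch of the same steps follows, under stated assumptions: the names `IPV4_LIKE` and `top_addresses` and the sample log lines are invented for illustration and do not come from the bookmarked article.

```python
import re
from collections import Counter

# Same pattern as the egrep call. The group is non-capturing so that
# findall returns the whole match rather than only the last group.
IPV4_LIKE = re.compile(r"[0-9]+(?:\.[0-9]+){3}")


def top_addresses(lines, n=10):
    # Count every match (the `sort | uniq -c` step) and return the n most
    # frequent ones (the `sort -nr | head` step).
    counts = Counter()
    for line in lines:
        counts.update(IPV4_LIKE.findall(line))
    return counts.most_common(n)


# Hypothetical log lines standing in for /var/log/system.log.
log_lines = [
    "Feb 20 10:00:01 host sshd: connection from 192.168.0.5 port 22",
    "Feb 20 10:00:02 host sshd: connection from 10.0.0.1 port 22",
    "Feb 20 10:00:03 host httpd: 192.168.0.5 -> 10.0.0.7",
    "Feb 20 10:00:04 host sshd: connection from 192.168.0.5 port 22",
]
for address, count in top_addresses(log_lines):
    print(count, address)
```

As with the shell version, the pattern does not validate addresses: it accepts any four dot-separated digit runs, so a string like 999.1.2.3 would also be counted. Entries with equal counts may come out in a different order than `sort -nr` would give.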

Regular expressions


reAnimator: Regular Expression FSA Visualizer

This application requires the Flash plugin.

lockcole 2006/03/02

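Visualizes regular expressions as finite automaton diagrams. They end up looking like the diagrams you see in academic material on compilers and regular grammars. Seeing this really makes the idea click! I would love to see it adopted in classes. 3x easier to understand!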


Regular Expression Library

Welcome to RegExLib.com, the Internet's first Regular Expression Library. Currently we have indexed 4149 expressions from 2818 contributors around the world. We hope you'll find this site useful and come back whenever you need help writing an expression, you're looking for an expression for a particular task, or are ready to contribute new expressions you’ve just figured out. Thanks!

lockcole 2006/02/03

A user-contributed regular expression library.


AJAX Reguler EXpression eValuator


lockcole 2006/01/19

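A tool for trying out PCRE, POSIX, and JavaScript regular expressions. The results are easy to read, which is nice. Even among local tools there is nothing that bundles this much together, so it looks like it will come in handy.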


A Regular Expression for Extracting All Kanji - totonのブログ

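http://www.unixuser.org/~euske/doc/python/sample.py.html
# Regular expression for splitting out Japanese tokens.
JP_TOKEN = re.compile(u"[一-龠]+|[ぁ-ん]+|[ァ-ヴ]+|[a-zA-Z0-9]+")

http://www.ascii.co.jp/pb/ascii/archive/aftercare/1999.html
[亜-煕] is the regular expression to use when detecting JIS kanji. The regular expressions in "一太郎 Lite2", which is mentioned in the text, follow Unicode, so to detect all kanji use [一-龠] instead (龠 has the on-reading "yaku" and the kun-reading "fue", and corresponds to "9FA0" in Unicode).

Addendum: An article on "a regular expression for extracting all kanji" has been added at id:toton:20051105.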
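To see what the quoted pattern does in practice, here is a small sketch that applies it to a sample sentence. It assumes Python 3, where string literals are already Unicode, so the `u` prefix is dropped. The sample text and the `KANJI_ONLY` name are assumptions added for illustration and are not part of the quoted sources.

```python
import re

# The token pattern quoted above: runs of kanji, hiragana, katakana,
# or ASCII alphanumerics, tried in that order at each position.
JP_TOKEN = re.compile("[一-龠]+|[ぁ-ん]+|[ァ-ヴ]+|[a-zA-Z0-9]+")

# Hypothetical helper pattern: only the kanji range, U+4E00 to U+9FA0.
KANJI_ONLY = re.compile("[一-龠]+")

text = "正規表現でPythonのテキストを処理する"

tokens = JP_TOKEN.findall(text)
kanji_runs = KANJI_ONLY.findall(text)

print(tokens)      # script-by-script tokens
print(kanji_runs)  # kanji runs only
```

One caveat with these ranges: the katakana class ァ-ヴ does not include the prolonged sound mark ー (U+30FC), so a word such as データ is split into デ and タ, with the ー dropped. Extending the class would be a design choice of its own and is not something the quoted sources do.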

lockcole 2005/11/05

I had a situation where I wanted exactly this. Great! For UTF-8.
