Information Technology Tips

Posts

Showing posts with the label HTML

An HTML Editor That Runs in JavaScript

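Sometimes you need an HTML editor implemented in JavaScript so that HTML can be edited directly on a web page (something like the editors in WordPress or Blogger). There are many such editors, but one called Code Mirror is feature-rich and convenient. It offers the following features: support for more than 100 languages; powerful XML tag completion; code folding; key bindings matching Vim, Emacs, and others; search and replace; bracket and tag matching; and many add-ons.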

Escape Html by Javascript

When I paste a code snippet into this blog, I have to escape special characters such as "<" and ">". I googled how to escape HTML special characters with JavaScript and found escaping-html-strings-with-jquery. Based on that answer, I created the simple HTML escape tool below. Copy and paste text into the first text area, and the HTML-escaped text will be output in the text area below it. The whole JavaScript code is below!
    <script src="//ajax.googleapis.com/ajax/libs/jquery/1.8.3/jquery.min.js"></script>
    <script>
    $(function(){
      // Re-escape the source text on every key release.
      $("#src_html").keyup(function(){
        $("#dest_html").val(escapeHtml($("#src_html").val()));
      });
      // Each special character and the entity that replaces it.
      var entityMap = {
        "&": "&amp;",
        "<": "&lt;",
        ">": "&gt;",
        '"': '&quot;',
        "'": '&#39;',
        "/": '&#x2F;'
      };
      func...
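
The escapeHtml function is cut off in the excerpt above. As a rough illustration of the same entity-map idea, here is a minimal Java sketch. The class name HtmlEscaper is hypothetical, and the mapping assumes the six characters listed as keys in entityMap; it is not the author's JavaScript implementation.

```java
import java.util.Map;

// Hypothetical helper illustrating the entity-map approach (not the original JavaScript).
final class HtmlEscaper {
    // Each special character and the entity that replaces it.
    private static final Map<Character, String> ENTITY_MAP = Map.of(
            '&', "&amp;",
            '<', "&lt;",
            '>', "&gt;",
            '"', "&quot;",
            '\'', "&#39;",
            '/', "&#x2F;");

    // Returns a copy of text in which every mapped character is replaced by its entity.
    static String escapeHtml(String text) {
        StringBuilder sb = new StringBuilder(text.length() + 16);
        for (int i = 0; i < text.length(); i++) {
            char c = text.charAt(i);
            String entity = ENTITY_MAP.get(c);
            if (entity != null) {
                sb.append(entity);
            } else {
                sb.append(c);
            }
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.println(escapeHtml("<a href=\"x\">Tom & Jerry's</a>"));
    }
}
```

Scanning character by character means an "&" that is produced as part of an entity is never escaped a second time, which chained string replacements have to guard against by handling "&" first.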

Java: Extract img src from HTML

Here is a code snippet for extracting image src values from HTML.
    import java.util.ArrayList;
    import java.util.List;
    import java.util.regex.Matcher;
    import java.util.regex.Pattern;

    // [^>]*? keeps the match inside a single <img ...> tag;
    // group 1 is the quote character and group 2 is the src value.
    private static final Pattern IMG_SRC_PATTERN =
        Pattern.compile("<img\\s+[^>]*?src\\s*=\\s*('|\")(.+?)\\1[^>]*>");

    public static List<String> extractImgSrces(final String content) {
        List<String> list = new ArrayList<>();
        final Matcher matcher = IMG_SRC_PATTERN.matcher(content);
        while (matcher.find()) {
            list.add(matcher.group(2));
        }
        return list;
    }
Example usage is below. You only need to provide the HttpUtils.getStringContentsFromURL method yourself; it fetches the HTML from the given URL.
    public static void main(String[] args) throws URISyntaxException, IOException {
        extractImgSrces(HttpUtils.getStringContentsFromURL("http://www.google.com/", "utf-8"))
            .forEach(System.out::println);
    }

Restrict Html Tags which User Can Input (PHP)

I tried to implement a very simple HTML editing text area in which the tags a user can input are restricted, so I needed a validator that detects tags that are not allowed. The proper (but somewhat heavy) approach is to use Tidy. It can validate the entire HTML and also fix and clean up the HTML source! In my case, however, Tidy is overkill. Instead, I decided to use the strip_tags function. The disadvantage is that the function does not validate HTML syntax, i.e. it is less accurate than Tidy. As the official PHP documentation says, "strip_tags function does not actually validate the html, partial or broken tags can result in the removal of more text/data than expected." As long as we understand this disadvantage, we can use the function. Here is the code.
    function validateOnlyAllowedTags($html, $tags) {
        $stripped = strip_tags($html, $tags);
        // if no tags are stripped, the length of html contents ...
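
The PHP validator above relies on strip_tags, and its body is cut off in this excerpt. A comparable allowlist check can be sketched in Java, which has no strip_tags equivalent in its standard library, by scanning for tag names with a regular expression. This swaps in a different technique from the post's. The class and method names are hypothetical, and like strip_tags this sketch does not validate HTML syntax: comments and malformed tags are simply not inspected.

```java
import java.util.Locale;
import java.util.Set;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical allowlist check; a regex scan, not an HTML parser.
final class AllowedTagValidator {
    // Matches an opening or closing tag and captures its name,
    // e.g. "<p>", "</b>", "<a href='x'>".
    private static final Pattern TAG =
            Pattern.compile("</?([a-zA-Z][a-zA-Z0-9]*)[^>]*>");

    // Returns true when every tag found in html has a lower-case name in allowedTags.
    static boolean containsOnlyAllowedTags(String html, Set<String> allowedTags) {
        Matcher m = TAG.matcher(html);
        while (m.find()) {
            String name = m.group(1).toLowerCase(Locale.ROOT);
            if (!allowedTags.contains(name)) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        Set<String> allowed = Set.of("p", "b", "a");
        System.out.println(containsOnlyAllowedTags("<p>Hello <b>world</b></p>", allowed));
        System.out.println(containsOnlyAllowedTags("<p><script>alert(1)</script></p>", allowed));
    }
}
```

For input that will be rendered to other users, a real HTML sanitizer or parser is the safer choice; a check like this is only a quick first-line validation.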

Html Layout by JSP

Introduction: In this post, I will show how to build an HTML layout with JSP. You define the layout once and fill in the individual pieces in each actual JSP. I won't explain the details; I will just show a quick example of sharing a header and footer across all JSP pages. If you are interested in how and why it works, please refer to the JSP documentation (or google it).
Code: You simply need to prepare a layout tag file and the actual JSP files. First, prepare layout.tag (the name itself is not important) and put it under WEB-INF/tags. Second, put header.jsp and footer.jsp under WEB-INF/views/. Third, create the actual JSP, which includes the taglib defined in the first step. OK, now I will show the actual code.
layout.tag: The key part is the fragment feature.
    <%@tag description="Layout template" pageEncoding="UTF-8"%>
    <%@attribute name="main" fragment="true" %>
    <%@attribute name="head" fragment="true" %>
    <%@attribu...

C#: Extract A Href Links from Html Text

    // I know 1, 2, 3 are bad group names X(
    private readonly static Regex LINK_REGEX = new Regex(
        @"<a\s+[^>]*href\s*=\s*(?:(?<3>'|"")(?<1>[^\3>]*?)\3|(?<1>[^\s>]+))[^>]*>(?<2>.*?)</a>",
        RegexOptions.IgnoreCase | RegexOptions.Compiled
    );

    // Adds the href value (group 1) of every matched link to the given collection.
    public static void ExtractLinks(string text, ICollection<string> links) {
        LINK_REGEX.ApplyAllMatched(text, (m) => links.Add(m.Groups[1].Value));
    }
A helper extension method that applies an action to each match:
    public static void ApplyAllMatched(this Regex regex, string text, Action<Match> apply) {
        for (var m = regex.Match(text); m.Success; m = m.NextMatch()) {
            apply(m);
        }
    }

C#: Extract Charset from Html Meta Tag

    private static readonly Regex META_TAG_CHARSET_REGEX = new Regex(
        @"<META\s+http-equiv\s*=\s*Content-Type\s+content=\s*""[^""]*\s+charset\s*=\s*(?<charset>[^""\s]*).*""\s*>",
        RegexOptions.IgnoreCase | RegexOptions.Compiled);

    // Returns the charset named in the META tag, or null when no such tag is found.
    public static string ExtractCharset(string htmlText) {
        string result = null;
        var m = META_TAG_CHARSET_REGEX.Match(htmlText);
        if (m.Success) {
            result = m.Groups["charset"].Value;
        }
        return result;
    }
Note: The System.Text.RegularExpressions.Regex class is thread-safe according to this MSDN doc.

C#: XPath and HtmlTextWriter Example

Hey guys!! This post shows how to use XPath in C#. As an example, I wrote a program that formats an RSS feed (XML) as HTML.
XPath in C#: The simplest way is to create an XmlDocument and call its SelectNodes method.
    var xmlString = "some xml string....";
    // create XmlDocument
    var doc = new XmlDocument();
    doc.LoadXml(xmlString);
    // select nodes by XPath string
    doc.SelectNodes("/rss/channel/item");
RSS Xml to Html Example: Okay, here is a simple XPath example that converts RSS XML to HTML.
    using System.IO;
    using System.Web.UI;
    using System.Web;
    using System.Xml; // needed for XmlTextReader

    namespace Utility {
        public class RSStoHtmlWriter : HTMLWriteHelper {
            private readonly string url;

            public RSStoHtmlWriter(string url) {
                this.url = url;
            }

            public override void WriteBody(HtmlTextWriter htmlWriter) {
                using (var reader = new XmlTextReader(url)) {
                    // **** using XPath!! ****
                    // As you n...
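
The C# example is cut off in this excerpt. To illustrate the same idea (select /rss/channel/item with XPath, then emit HTML), here is a minimal Java sketch that uses the standard javax.xml.xpath API. The class name RssToHtml, the choice of a ul list, and the title and link child elements it reads are assumptions for illustration, not the author's code.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical converter: RSS items selected by XPath are rendered as an HTML list.
final class RssToHtml {
    static String toHtmlList(String rssXml) {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new InputSource(new StringReader(rssXml)));
            XPath xpath = XPathFactory.newInstance().newXPath();
            // Select every item node under the channel.
            NodeList items = (NodeList) xpath.evaluate(
                    "/rss/channel/item", doc, XPathConstants.NODESET);
            StringBuilder html = new StringBuilder("<ul>");
            for (int i = 0; i < items.getLength(); i++) {
                Node item = items.item(i);
                // Relative XPath expressions are evaluated against the current item.
                String title = xpath.evaluate("title", item);
                String link = xpath.evaluate("link", item);
                html.append("<li><a href=\"").append(escape(link)).append("\">")
                    .append(escape(title)).append("</a></li>");
            }
            return html.append("</ul>").toString();
        } catch (Exception e) {
            // Parsing and XPath errors are wrapped so callers need no checked exceptions.
            throw new IllegalArgumentException("Could not convert RSS to HTML", e);
        }
    }

    // Minimal escaping for text and attribute values.
    private static String escape(String s) {
        return s.replace("&", "&amp;").replace("<", "&lt;")
                .replace(">", "&gt;").replace("\"", "&quot;");
    }

    public static void main(String[] args) {
        String rss = "<rss><channel><item><title>Hello</title>"
                + "<link>http://example.com/hello</link></item></channel></rss>";
        System.out.println(toHtmlList(rss));
    }
}
```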

HTML related library for Java

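HTML Parser: I still haven't found a Java HTML parser library that feels quite right, but here are some that I have tried.
JTidy: Especially powerful when parsing XHTML files.
HTMLEditorKit: Bundled with Swing. Personally, I am not comfortable using a Swing library for the purpose of parsing HTML.
NekoHTML: Unfortunately I haven't tried it yet, but it looks easy to use. I will write about it on the blog if I get the chance.
The Java HTML Parsing discussion on Stack Overflow looks like a useful reference.
HTML Parser: jsoup
HTML Validator: JTidy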

Code for Getting the Character Encoding from an HTML META Tag

The ExtractEncoding method uses Regex to extract the charset part from the META tag. The Read method first uses ExtractEncoding to obtain the appropriate character encoding from the HTML file at the given path, and then converts the HTML it read using that encoding.
    using System;
    using System.Text;
    using System.Text.RegularExpressions;
    using System.IO;

    namespace Utility {
        public static class EncodingUtils {
            private static readonly Regex r = new Regex(
                @"<META\s+http-equiv\s*=\s*Content-Type\s+content=\s*""[^""]*\s+charset\s*=\s*(?<charset>[^""\s]*).*""\s*>",
                RegexOptions.IgnoreCase | RegexOptions.Compiled);

            public static string ExtractEncoding(string html) {
                string result = null;
                lock (r) {
                    Match m = r.Match(html);
                    if (m.Success) {
                        result = m.Groups["charset"].Value;
                    }
                }
                return result;
            }
            ...
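
The Read method is cut off in this excerpt. The two-step flow described above (find the charset in the META tag, then decode with it) can be sketched in Java as follows. The class name HtmlCharsetDetector, the fallback parameter, and the looser pattern (which also accepts the short meta charset form) are assumptions for illustration, not a port of the author's regex.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper: detect the charset declared in a META tag, then decode the bytes.
final class HtmlCharsetDetector {
    // Finds charset=... inside a <meta ...> tag (assumed pattern, looser than the C# one).
    private static final Pattern META_CHARSET = Pattern.compile(
            "<meta\\s[^>]*?charset\\s*=\\s*[\"']?([A-Za-z0-9_.:-]+)",
            Pattern.CASE_INSENSITIVE);

    static Charset detect(byte[] htmlBytes, Charset fallback) {
        // ISO-8859-1 maps every byte to one char, so the ASCII META tag survives this probe.
        String probe = new String(htmlBytes, StandardCharsets.ISO_8859_1);
        Matcher m = META_CHARSET.matcher(probe);
        if (m.find()) {
            try {
                return Charset.forName(m.group(1));
            } catch (IllegalArgumentException e) {
                // Unknown or malformed charset name: fall through to the fallback.
            }
        }
        return fallback;
    }

    // Decodes the bytes with the detected charset, or with fallback when none is declared.
    static String decode(byte[] htmlBytes, Charset fallback) {
        return new String(htmlBytes, detect(htmlBytes, fallback));
    }

    public static void main(String[] args) {
        byte[] bytes = ("<meta http-equiv=\"Content-Type\" "
                + "content=\"text/html; charset=utf-8\"><p>\u65e5\u672c\u8a9e</p>")
                .getBytes(StandardCharsets.UTF_8);
        System.out.println(detect(bytes, StandardCharsets.ISO_8859_1));
        System.out.println(decode(bytes, StandardCharsets.ISO_8859_1));
    }
}
```

Probing the raw bytes as ISO-8859-1 first avoids having to guess an encoding before the META tag has been read.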

Speeding Up Websites

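I substantially updated this post in July 2017. As of 2017, full SSL, HTTP/2, and PWA technologies have become commonplace, and the content of the books introduced below is out of date. For example, HTTP/2's parallel resource downloading is very useful. Many new features such as dns-prefetch, preload, and preconnect have also been announced.
Books: The books below are now too old and no longer have the impact they had when they were published, so I don't think there is much need to read them. I am leaving them here for the record. ハイパフォーマンス Webサイト is published by O'Reilly. The page for the original edition, High Performance Web Sites, is here. The home page of the original author, Steve Souders, is here. There is also a support page run by the translator. The sequel is 続ハイパフォーマンス Webサイト.
Tools: PageSpeed Insights is so famous that it hardly needs an introduction, but do give Google's tool a try. It offers all kinds of advice, such as using CSS sprites, adding an Expires header, and enabling gzip compression. WebPagetest is an excellent tool that measures in detail how long it takes to download the images needed to display a real page, and it outputs the data as waterfall charts, video, and more. PageSpeed Insights presents its results as a score, but I think WebPagetest is the better tool for examining what real users actually experience.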