Generation of Web Pages from Document Image

Research Article

Published in June 2014 by Aparna Halbe, Abhijit R. Joshi

International Conference and workshop on Advanced Computing 2014

Foundation of Computer Science USA

ICWAC2014 - Number 2

June 2014

Authors: Aparna Halbe, Abhijit R. Joshi

Aparna Halbe, Abhijit R. Joshi. Generation of Web Pages from Document Image. International Conference and workshop on Advanced Computing 2014. ICWAC2014, 2 (June 2014), 0-0.

@article{
author = { Aparna Halbe and Abhijit R. Joshi },
title = { Generation of Web Pages from Document Image },
journal = { International Conference and workshop on Advanced Computing 2014 },
issue_date = { June 2014 },
volume = { ICWAC2014 },
number = { 2 },
month = { June },
year = { 2014 },
issn = { 2249-0868 },
pages = { 0-0 },
numpages = { 1 },
url = { /proceedings/icwac2014/number2/649-1434/ },
publisher = {Foundation of Computer Science (FCS), NY, USA},
address = {New York, USA}
}

%0 Proceeding Article
%1 International Conference and workshop on Advanced Computing 2014
%A Aparna Halbe
%A Abhijit R. Joshi
%T Generation of Web Pages from Document Image
%J International Conference and workshop on Advanced Computing 2014
%@ 2249-0868
%V ICWAC2014
%N 2
%P 0-0
%D 2014
%I International Journal of Applied Information Systems

Abstract

Software projects typically begin with a requirements specification, followed by user interface (UI) design. The UI design is usually sketched on paper first, and web designers then build the web pages to match the sketch, using markup languages such as HTML and XML to design and publish them on the internet. This paper proposes a novel approach that automates the web designer's task: the system converts a UI design drawn on paper into an HTML page. A scanned image of the UI design is given as input, and the system generates an HTML page of that UI as output. Doing so requires converting a paper document image into a hyperdocument. Existing work in this area is restricted to converting the text and images in a document; it does not consider HTML controls such as text boxes, radio buttons, checkboxes, and buttons. An existing system therefore only converts the paper document into a hypertext document and does not identify each HTML control as a separate component, which is a primary requirement when designing a UI. Given a UI design containing HTML controls, such a system produces an HTML page that contains an image of the UI design rather than actual, functional controls. The proposed work addresses these issues and covers most of the HTML controls required for designing static pages.
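
The last step described in the abstract, turning recognized UI components into real HTML controls instead of embedding the scanned image, can be illustrated with a short sketch. This is not the authors' implementation: the abstract does not say how components are detected, so the sketch assumes some earlier stage has already produced a list of components with a kind, a position, and an optional label. The names `Component`, `control_html`, and `render_page` are hypothetical and introduced only for illustration.

```python
from dataclasses import dataclass
from html import escape


@dataclass
class Component:
    """One UI element assumed to have been recognized in the scanned design."""
    kind: str        # "text", "textbox", "radio", "checkbox", or "button"
    x: int           # left edge in pixels (assumed to come from the detector)
    y: int           # top edge in pixels
    label: str = ""  # recognized caption, if any


def control_html(c: Component, index: int) -> str:
    """Map one recognized component to an actual HTML control."""
    label = escape(c.label, quote=True)
    if c.kind == "text":
        return f"<span>{label}</span>"
    if c.kind == "textbox":
        return f'<input type="text" name="field{index}">'
    if c.kind in ("radio", "checkbox"):
        return f'<label><input type="{c.kind}" name="field{index}"> {label}</label>'
    if c.kind == "button":
        return f'<button type="button">{label}</button>'
    raise ValueError(f"unsupported component kind: {c.kind}")


def render_page(components: list[Component]) -> str:
    """Emit a static HTML page, placing each control at its detected position."""
    rows = []
    # Order top-to-bottom, then left-to-right, so the markup follows reading order.
    ordered = sorted(components, key=lambda c: (c.y, c.x))
    for i, c in enumerate(ordered):
        style = f"position:absolute;left:{c.x}px;top:{c.y}px"
        rows.append(f'  <div style="{style}">{control_html(c, i)}</div>')
    body = "\n".join(rows)
    return f"<!DOCTYPE html>\n<html>\n<body>\n{body}\n</body>\n</html>"


# Hypothetical detector output for a tiny hand-drawn form.
detected = [
    Component("button", 120, 80, "Submit"),
    Component("text", 20, 40, "Name"),
    Component("textbox", 120, 40),
]
page = render_page(detected)
print(page)
```

Each control in the output is a separate element that a browser can interact with, which is the distinction the abstract draws against systems that embed an image of the whole design. The sketch keeps absolute positioning for simplicity and gives every control its own name, so grouping radio buttons into a shared set is left out.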

Index Terms

Computer Science

Information Sciences

Keywords

Image processing for GUI, rapid web development, GUI design processing, document images, automatic web page generation