
18. Minitel filtering

The captured output is cluttered with the Minitel's navigation prompts and page changes.

The following script filters the ASCII pages generated by xtel. It removes the page breaks and all of the Minitel navigation information, keeping only the relevant data (name, address and telephone number). With a little extra processing, its output could easily feed a database.


#!/usr/bin/perl -w
#
#    minitel-filter.pl
#
#    Copyright (C) 1999, Gilles Lamiral
#
#    This program is free software; you can redistribute it and/or modify
#    it under the terms of the GNU General Public License as published by
#    the Free Software Foundation; either version 2 of the License, or
#    (at your option) any later version.
#
#    You should have received a copy of the GNU General Public License
#    along with this program; if not, write to the Free Software
#    Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA  02111-1307  USA
#
# usage:     minitel-filter.pl  input-file
#        or  cat input-file | minitel-filter.pl
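use strict;

<>;                          # read and discard the first input line
my $file = join("", <>);     # slurp the rest of the input into one string

# Split the text into Minitel pages on the three-line navigation footer.
my @pages = split(/                  page suivante  Suite  \nrevoir la liste de specialites\*  Retour \n   infos sur ordre des reponses \* Envoi \n/, $file);
foreach my $page (@pages) {
    # A full-width row of backquotes separates the page header from the entries.
    my @infos = split(/````````````````````````````````````````\n/, $page);
    next unless defined $infos[1];    # no entries on this page
    # Entries are separated by a shorter, indented row of backquotes.
    my @enterprises = split(/    ````````````````````````````````````\n/, $infos[1]);
    foreach my $enterprise (@enterprises) {
        print "=========================================\n",
              "Etat:\n",
              "-----------------------------------------\n";
        # Drop the space-filled blank lines inside an entry and print the rest.
        print split(/                                        \n/, $enterprise);
    }
}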
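
As a rough illustration of the "extra processing" mentioned above, the sketch below turns the output of minitel-filter.pl into one tab-separated line per entry, a form most database import tools accept. It is only a sketch under assumptions: the script name records.pl and the function parse_records are hypothetical, and it treats every non-empty line of an entry as one field, since the exact layout of name, address and telephone lines depends on the Minitel service being captured.

```perl
#!/usr/bin/perl -w
# records.pl (hypothetical name): convert minitel-filter.pl output
# into one tab-separated line per entry.
use strict;

# Split the filtered text into records. Returns a list of array
# references, each holding the trimmed, non-empty lines of one entry.
# Assumption: every entry is preceded by the banner that
# minitel-filter.pl prints ("=====", "Etat:", "-----").
sub parse_records {
    my ($text) = @_;
    my @records;
    foreach my $block (split(/=+\nEtat:\n-+\n/, $text)) {
        my @fields;
        foreach my $line (split(/\n/, $block)) {
            $line =~ s/^\s+|\s+$//g;    # trim both ends
            $line =~ s/\s+/ /g;         # collapse runs of whitespace
            push @fields, $line if length $line;
        }
        push @records, \@fields if @fields;
    }
    return @records;
}

# Act as a filter only when run directly, not when loaded by other code.
unless (caller) {
    local $/;                           # slurp mode
    my $text = <STDIN>;
    $text = "" unless defined $text;
    print join("\t", @$_), "\n" foreach parse_records($text);
}
```

It would be chained after the filter, for example `minitel-filter.pl input-file | perl records.pl > entries.tsv`. Mapping the resulting columns to actual name, address and telephone fields still has to be checked against real captured pages.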



Generated: 2007-01-26 18:01:30