initial commit

Tobias Diekershoff 2015-01-04 21:46:33 +01:00
commit 8eeedc3a3c
54 changed files with 179682 additions and 0 deletions

50
README.md Normal file

@@ -0,0 +1,50 @@
Typography Addon
================
This addon uses the [php typography](http://kingdesk.com/projects/php-typography/)
library by KINGdesk to enhance the typography of postings in friendica.
The addon uses the language detection capabilities of friendica to select the
appropriate set of typographic rules.
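A minimal sketch of how a detected language code could be mapped to one of the pattern files under `php-typography/lang/` is shown below. The helper name `sketch_pattern_file()`, its parameters, and the fallback behaviour are assumptions made for illustration; the addon's actual selection logic is not shown in this README and may work differently.

```php
<?php
// Hypothetical helper, not part of the addon or of PHP Typography.
// Returns the path of the pattern file for a language code, or null.
function sketch_pattern_file(string $langCode, string $langDir, string $fallback): ?string
{
    // Accept only simple codes such as "de" or "de-AT" so that a detected
    // value can never be used to escape the language directory.
    if (!preg_match('/^[a-z]{2,3}(-[A-Za-z]{2,8})?$/', $langCode)) {
        $langCode = $fallback;
    }

    // Try the full code first, then the bare language, then the fallback.
    $base = explode('-', $langCode)[0];
    foreach (array($langCode, $base, $fallback) as $candidate) {
        $path = rtrim($langDir, '/') . '/' . $candidate . '.php';
        if (is_file($path)) {
            return $path;
        }
    }
    return null;
}

// Usage (the directory layout and the 'en-US' fallback are assumptions):
$file = sketch_pattern_file('de-AT', __DIR__ . '/php-typography/lang', 'en-US');
```

Returning `null` rather than raising an error lets the caller skip hyphenation for languages without a pattern file.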
ToDo
----
* write some CSS to enhance the typography of stuff like ALL CAPS etc. a bit
further, now that they are identified and marked.
* There was one thing in the library that I had to comment out to get rid of
many warnings (see php-typography.php lines 1964 and following) which should
be fixed in a better way.
History
-------
* 2014-12-29: Version 0.1, initial release
Author
------
* [Tobias Diekershoff](https://f.diekershoff.de/profile/tobias)
License
-------
This addon is licensed under the terms of the [GNU GPL 2.0](https://www.gnu.org/licenses/gpl-2.0.html)
as the underlying library uses this license.
Copyright (C) 2014 Tobias Diekershoff
This program is free software; you can redistribute it and/or
modify it under the terms of the GNU General Public License
as published by the Free Software Foundation; either version 2
of the License, or (at your option) any later version.
This program is distributed in the hope that it will be useful,
but WITHOUT ANY WARRANTY; without even the implied warranty of
MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
GNU General Public License for more details.
You should have received a copy of the GNU General Public License
along with this program; if not, write to the Free Software
Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA
02110-1301, USA.


@@ -0,0 +1,169 @@
1.21 - December 31, 2009
Fixed bug in custom diacritic handling
1.20 - December 20, 2009
Resolved uninitialized variable
Added HTML5 elements to parsing algorithm for greater contextual awareness
Updated to PHP Parser 1.20
1.19 - December 1, 2009
Fixed bug where dewidow functionality would add broken no-break spaces to the end of texts, and smart_exponents would drop some of the resulting text.
Declared encoding in all instances of mb_substr to avoid conflicts
Corrected a few instances of undeclared variables.
Updated to PHP Parser 1.19
1.18 - November 10, 2009
Added Norwegian Hyphenation Patterns
1.17 - November 9, 2009
Fixed bug in diacritic handling
1.16 - November 4, 2009
Added US English list of all words containing diacritics to `/diacritics/en-US.php`
Added get_diacritic_languages() method
Added set_smart_diacritics() method
Added set_diacritic_language() method
Added set_diacritic_custom_replacements() method
Added smart_diacritics() method
Improved smart quotes and dashes to be sensitive to adjacent diacritic characters.
1.15 - October 21, 2009
Deprecated set_smart_quotes_language()
Added set_smart_quotes_primary()
Added set_smart_quotes_secondary()
1.14 - September 8, 2009
Improved space_collapse method
Corrected bug in smart quote and single character word handling where the "0" character may be improperly duplicated
1.13 - August 31, 2009
Added set_space_collapse and space_collapse methods
1.12 - August 17, 2009
Corrected multibyte character error that caused set_single_character_word_spacing() to drop words under rare circumstances
1.11 - August 14, 2009
Added language specific quote handling (for single quotes, not just double) for English, German and French quotation styles
1.10 - August 14, 2009
Added set_smart_quotes_language() for unique handling of English, German and French quotation styles
Corrected multibyte character error that caused set_single_character_word_spacing() to drop words under rare circumstances
Expanded the multibyte character set recognized as valid word characters for improved hyphenation
Updated to PHP Parser 1.10
1.9 - August 12, 2009
Added option to force single character words to wrap to new line (unless they are widows).
Fixed bug where hyphenation pattern settings were not initialized with multiple phpTypography class instances.
1.8 - August 4, 2009
Fixed date handling in smart_math() and smart_dashes() methods
Fixed style_caps() method to be friendly with soft-hyphens
1.7 - July 28, 2009
Reformatted language files with line returns after each key=>value pair in an array
1.6 - July 28, 2009
Efficiency optimizations (approximately 25% speed increase). Thanks, Jenny!
1.5 - July 27, 2009
Added the set_hyphenate_title_case() method to exclude hyphenation of capitalized (title case) words to help protect proper nouns
Added Hungarian Hyphenation Pattern
1.4 - July 23, 2009
Updated to PHP Parser 1.4 (fixed a hyphenation problem where pre-hyphenated words were processed again)
1.3 - July 23, 2009
Uninitialized variables corrected throughout.
Use of 2 instances of create_function() eliminated for performance gain
Cleaned up HTML character handling in process_feed(). No errors were identified prior to edit, but now it is consistent with how process() works.
1.2 - July 23, 2009
Moved widow handling to run after hyphenation, so that max-pull is compared to the length of the adjacent word segment (i.e. the part after a soft hyphen) rather than to the length of the whole adjacent word
1.1 - July 22, 2009
By default, when class phpTypography is constructed, set_defaults is called. However, if you are going to manually set all settings, you can now bypass the set_defaults call for slightly improved performance. Just call `$typo = new phpTypography(FALSE)`
Added `html_entity_decode` to process_feed to avoid invalid character injection (according to XML's specs)
1.0.3 - July 17, 2009
Reverted use of the hyphen character to the basic minus-hyphen in words like "mother-in-law" because of poor support in IE6
1.0.2 - July 16, 2009
Corrected smart_math to not convert slashes in URLs to division signs
1.0 - July 15, 2009
Added test to phpTypography methods process() and process_feed() to skip processing if $isTitle parameter is TRUE and h1 or h2 is an excluded HTML tag
1.0 beta 9 - July 14, 2009
Added catch-all quote handling: any quotes that escape the previous filters are now assumed to be closing quotes
1.0 beta 8 - July 13, 2009
Changed thin space injection behavior so that for text such as "...often-always?-judging...", the second dash will be wrapped in thin spaces
Corrected error where fractions were not being styled because of a zero-space insertion with the wrap hard hyphens functionality
Added default class to exclude: "noTypo"
1.0 beta 7 - July 10, 2009
Added "/" as a valid word character so that "this/that" is captured as a single word for processing (similar to "mother-in-law")
Corrected error where characters from the Latin 1 Supplement Block were not recognized as word characters
Corrected smart quote handling for strings of numbers
Added smart guillemet conversion as part of smart quotes: << and >> to « and »
Added smart Single Low 9 Quote conversion as part of smart quotes: comma followed by non-space becomes Single Low 9 Quote
Added Single Low 9 Quote, Double Low 9 Quote and » to style_initial_character functionality
Added a new phpTypography method smart_math that assigns proper characters to minus, multiplication and division characters
Deprecated the phpTypography method smart_multiplication in favor of smart_math
Cleaned up some smart quote functionality
Added ability to wrap after "/" if set_wrap_hard_hyphen is TRUE (like "this/that")
1.0 beta 6 - July 9, 2009
Critical bug fix: RSS feeds were being disabled by previous versions. This has been corrected.
1.0 beta 5 - July 8, 2009
corrected error where requiring Em/En dash thin spacing "word-" would become "word " instead of "word"
1.0 beta 4 - July 7, 2009
Added default encoding value to smart_quote handling to avoid PHP warning messages
1.0 beta 3 - July 6, 2009
Corrected curling quotes at the end of block level elements
1.0 beta 2 - July 6, 2009
Corrected multibyte character conflict in smart-quote handling that caused infrequent dropping of text
Thin space injection included for en-dashes
1.0 beta 1 - July 3, 2009
Initial release


@@ -0,0 +1,425 @@
<?php
/**
* Language: English (United States)
*
* An array of all words containing diacritics (and their non-diacritic
* alternatives that should be replaced), provided a legitimate English
* word does not exist without such diacritic characters (i.e.
* divorcé & divorce, exposé & expose, résumé & resume ).
*
* In the form of $diacriticWords = array( key => value );
* where "key" is the needle and "value" is the replacement
**/
$diacriticLanguage = 'English (United States)';
$diacriticWords = array(
"a bas"=>"à bas",
"A bas"=>"À bas",
"a la"=>"à la",
"A la"=>"À la",
"a la carte"=>"à la carte",
"A la carte"=>"À la carte",
"a la mode"=>"à la mode",
"A la mode"=>"À la mode",
"a gogo"=>"à gogo",
"A gogo"=>"À gogo",
"ago-go"=>"àgo-go",
"Ago-go"=>"Àgo-go",
"abbe"=>"abbé",
"Abbe"=>"Abbé",
"adios"=>"adiós",
"Adios"=>"Adiós",
"agrement"=>"agrément",
"Agrement"=>"Agrément",
"anime"=>"animé",
"Anime"=>"Animé",
"Ancien Regime"=>"Ancien Régime",
"angstrom"=>"ångström",
"Angstrom"=>"Ångström",
"anu"=>"añu",
"Anu"=>"Añu",
"ao dai"=>"áo dài",
"Ao dai"=>"Áo dài",
"aperitif"=>"apéritif",
"Aperitif"=>"Apéritif",
"applique"=>"appliqué",
"Applique"=>"Appliqué",
"apres-ski"=>"après-ski",
"Apres-ski"=>"Après-ski",
"arete"=>"arête",
"Arete"=>"Arête",
"attache"=>"attaché",
"Attache"=>"Attaché",
"auto-da-fe"=>"auto-da-fé",
"Auto-da-fe"=>"Auto-da-fé",
"acaí"=>"açaí",
"Acaí"=>"Açaí",
"belle epoque"=>"belle époque",
"Belle epoque"=>"Belle époque",
"bete noire"=>"bête noire",
"Bete noire"=>"Bête noire",
"betise"=>"bêtise",
"Betise"=>"Bêtise",
"blase"=>"blasé",
"Blase"=>"Blasé",
"boite"=>"boîte",
"Boite"=>"Boîte",
"Bon"=>"Bön",
"Bootes"=>"Boötes",
"boutonniere"=>"boutonnière",
"Boutonniere"=>"Boutonnière",
"bric-a-brac"=>"bric-à-brac",
"Bric-a-brac"=>"Bric-à-brac",
"cafe"=>"café",
"Cafe"=>"Café",
"canape"=>"canapé",
"Canape"=>"Canapé",
"Champs-Elysees"=>"Champs-Élysées",
"chateau"=>"château",
"Chateau"=>"Château",
"charge d'affaires"=>"chargé d'affaires",
"Charge d'affaires"=>"Chargé d'affaires",
"cause celebre"=>"cause célèbre",
"Cause celebre"=>"Cause célèbre",
"chaines"=>"chaînés",
"Chaines"=>"Chaînés",
"cinema verite"=>"cinéma vérité",
"Cinema verite"=>"Cinéma vérité",
"cliche"=>"cliché",
"Cliche"=>"Cliché",
"cloisonne"=>"cloisonné",
"Cloisonne"=>"Cloisonné",
"consomme"=>"consommé",
"Consomme"=>"Consommé",
"communique"=>"communiqué",
"Communique"=>"Communiqué",
"confrere"=>"confrère",
"Confrere"=>"Confrère",
"coopt"=>"coöpt",
"Coopt"=>"Coöpt",
"cortege"=>"cortège",
"Cortege"=>"Cortège",
"coup d'etat"=>"coup d'état",
"Coup d'etat"=>"Coup d'état",
"coup de grace"=>"coup de grâce",
"Coup de grace"=>"Coup de grâce",
"creche"=>"crèche",
"Creche"=>"Crèche",
"coulee"=>"coulée",
"Coulee"=>"Coulée",
"creme brulee"=>"crème brûlée",
"Creme brulee"=>"Crème brûlée",
"creme fraiche"=>"crème fraîche",
"Creme fraiche"=>"Crème fraîche",
"creme"=>"crème",
"Creme"=>"Crème",
"crepe"=>"crêpe",
"Crepe"=>"Crêpe",
"Creusa"=>"Creüsa",
"crouton"=>"croûton",
"Crouton"=>"Croûton",
"crudites"=>"crudités",
"Crudites"=>"Crudités",
"Curacao"=>"Curaçao",
"dais"=>"daïs",
"Dais"=>"Daïs",
"dau hoi"=>"dấu hỏi",
"Dau hoi"=>"Dấu hỏi",
"debutante"=>"débutante",
"Debutante"=>"Débutante",
"declasse"=>"déclassé",
"Declasse"=>"Déclassé",
"decolletage"=>"décolletage",
"Decolletage"=>"Décolletage",
"decollete"=>"décolleté",
"Decollete"=>"Décolleté",
"decor"=>"décor",
"Decor"=>"Décor",
"decoupage"=>"découpage",
"Decoupage"=>"Découpage",
"degage"=>"dégagé",
"Degage"=>"Dégagé",
"deja vu"=>"déjà vu",
"Deja vu"=>"Déjà vu",
"demode"=>"démodé",
"Demode"=>"Démodé",
"denouement"=>"dénouement",
"Denouement"=>"Dénouement",
"derailleur"=>"dérailleur",
"Derailleur"=>"Dérailleur",
"derriere"=>"derrière",
"Derriere"=>"Derrière",
"deshabille"=>"déshabillé",
"Deshabille"=>"Déshabillé",
"detente"=>"détente",
"Detente"=>"Détente",
"diamante"=>"diamanté",
"Diamante"=>"Diamanté",
"discotheque"=>"discothèque",
"Discotheque"=>"Discothèque",
"doppelganger"=>"doppelgänger",
"Doppelganger"=>"Doppelgänger",
"eclair"=>"éclair",
"Eclair"=>"Éclair",
"eclat"=>"éclat",
"Eclat"=>"Éclat",
"Eire"=>"Éire",
"El Nino"=>"El Niño",
"elan"=>"élan",
"Elan"=>"Élan",
"emigre"=>"émigré",
"Emigre"=>"Émigré",
"entree"=>"entrée",
"Entree"=>"Entrée",
"entrepot"=>"entrepôt",
"Entrepot"=>"Entrepôt",
"entrecote"=>"entrecôte",
"Entrecote"=>"Entrecôte",
"epee"=>"épée",
"Epee"=>"Épée",
"etouffee"=>"étouffée",
"Etouffee"=>"Étouffée",
"etude"=>"étude",
"Etude"=>"Étude",
"facade"=>"façade",
"Facade"=>"Façade",
"fete"=>"fête",
"Fete"=>"Fête",
"faience"=>"faïence",
"Faience"=>"Faïence",
"fiance"=>"fiancé",
"Fiance"=>"Fiancé",
"fiancee"=>"fiancée",
"Fiancee"=>"Fiancée",
"filmjolk"=>"filmjölk",
"Filmjolk"=>"Filmjölk",
"fin de siecle"=>"fin de siècle",
"Fin de siecle"=>"Fin de siècle",
"flambe"=>"flambé",
"Flambe"=>"Flambé",
"fleche"=>"flèche",
"Fleche"=>"Flèche",
"fohn wind"=>"föhn wind",
"Fohn wind"=>"Föhn wind",
"folie a deux"=>"folie à deux",
"Folie a deux"=>"Folie à deux",
"fouette"=>"fouetté",
"Fouette"=>"Fouetté",
"frappe"=>"frappé",
"Frappe"=>"Frappé",
"fraulein"=>"fräulein",
"Fraulein"=>"Fräulein",
"Fuhrer"=>"Führer",
"garcon"=>"garçon",
"Garcon"=>"Garçon",
"gateau"=>"gâteau",
"Gateau"=>"Gâteau",
"gemutlichkeit"=>"gemütlichkeit",
"Gemutlichkeit"=>"Gemütlichkeit",
"glace"=>"glacé",
"Glace"=>"Glacé",
"glogg"=>"glögg",
"Glogg"=>"Glögg",
"Gewurztraminer"=>"Gewürztraminer",
"Gotterdammerung"=>"Götterdämmerung",
"Grafenberg spot"=>"Gräfenberg spot",
"gruyere"=>"gruyère",
"Gruyere"=>"Gruyère",
"habitue"=>"habitué",
"Habitue"=>"Habitué",
"hacek"=>"háček",
"Hacek"=>"Háček",
"hors doeuvre"=>"hors dœuvre",
"Hors doeuvre"=>"Hors dœuvre",
"ingenue"=>"ingénue",
"Ingenue"=>"Ingénue",
"jager"=>"jäger",
"Jager"=>"Jäger",
"jalapeno"=>"jalapeño",
"Jalapeno"=>"Jalapeño",
"jardiniere"=>"jardinière",
"Jardiniere"=>"Jardinière",
"krouzek"=>"kroužek",
"Krouzek"=>"Kroužek",
"kummel"=>"kümmel",
"Kummel"=>"Kümmel",
"kaldolmar"=>"kåldolmar",
"Kaldolmar"=>"Kåldolmar",
"karaoke"=>"karaōke",
"Karaoke"=>"Karaōke",
"landler"=>"ländler",
"Landler"=>"Ländler",
"langue d'oil"=>"langue d'oïl",
"Langue d'oil"=>"Langue d'oïl",
"La Nina"=>"La Niña",
"litterateur"=>"littérateur",
"Litterateur"=>"Littérateur",
"lycee"=>"lycée",
"Lycee"=>"Lycée",
"macedoine"=>"macédoine",
"Macedoine"=>"Macédoine",
"macrame"=>"macramé",
"Macrame"=>"Macramé",
"maitre d'hotel"=>"maître d'hôtel",
"Maitre d'hotel"=>"Maître d'hôtel",
"malaguena"=>"malagueña",
"Malaguena"=>"Malagueña",
"manana"=>"mañana",
"Manana"=>"Mañana",
"manege"=>"manège",
"Manege"=>"Manège",
"manoeuvre"=>"manœuvre",
"Manoeuvre"=>"Manœuvre",
"manque"=>"manqué",
"Manque"=>"Manqué",
"materiel"=>"matériel",
"Materiel"=>"Matériel",
"matinee"=>"matinée",
"Matinee"=>"Matinée",
"melange"=>"mélange",
"Melange"=>"Mélange",
"melee"=>"mêlée",
"Melee"=>"Mêlée",
"menage a trois"=>"ménage à trois",
"Menage a trois"=>"Ménage à trois",
"mesalliance"=>"mésalliance",
"Mesalliance"=>"Mésalliance",
"metier"=>"métier",
"Metier"=>"Métier",
"Metis"=>"Métis",
"minaudiere"=>"minaudière",
"Minaudiere"=>"Minaudière",
"moire"=>"moiré",
"Moire"=>"Moiré",
"Montreal"=>"Montréal",
"naif"=>"naïf",
"Naif"=>"Naïf",
"naive"=>"naïve",
"Naive"=>"Naïve",
"naivete"=>"naïveté",
"Naivete"=>"Naïveté",
"ne"=>"",
"Ne"=>"",
"nee"=>"née",
"Nee"=>"Née",
"negligee"=>"négligée",
"Negligee"=>"Négligée",
"Neufchatel"=>"Neufchâtel",
"Nez Perce"=>"Nez Percé",
"Noel"=>"Noël",
"numero uno"=>"número uno",
"Numero uno"=>"Número uno",
"Montano"=>"Montaño",
"objet trouve"=>"objet trouvé",
"Objet trouve"=>"Objet trouvé",
"ole"=>"olé",
"Ole"=>"Olé",
"ombre"=>"ombré",
"Ombre"=>"Ombré",
"omerta"=>"omertà",
"Omerta"=>"Omertà",
"opera bouffe"=>"opéra bouffe",
"Opera bouffe"=>"Opéra bouffe",
"opera comique"=>"opéra comique",
"Opera comique"=>"Opéra comique",
"outre"=>"outré",
"Outre"=>"Outré",
"papier-mache"=>"papier-mâché",
"Papier-mache"=>"Papier-mâché",
"passe"=>"passé",
"Passe"=>"Passé",
"pate"=>"pâté",
"Pate"=>"Pâté",
"pho"=>"phở",
"Pho"=>"Phở",
"piece de resistance"=>"pièce de résistance",
"Piece de resistance"=>"Pièce de résistance",
"pied-a-terre"=>"pied-à-terre",
"Pied-a-terre"=>"Pied-à-terre",
"plisse"=>"plissé",
"Plisse"=>"Plissé",
"pina colada"=>"piña colada",
"Pina colada"=>"Piña colada",
"pinata"=>"piñata",
"Pinata"=>"Piñata",
"pinon"=>"piñón",
"Pinon"=>"Piñón",
"pirana"=>"piraña",
"Pirana"=>"Piraña",
"pique"=>"piqué",
"Pique"=>"Piqué",
"piu"=>"più",
"Piu"=>"Più",
"plie"=>"plié",
"Plie"=>"Plié",
"precis"=>"précis",
"Precis"=>"Précis",
"polsa"=>"pölsa",
"Polsa"=>"Pölsa",
"premiere"=>"première",
"Premiere"=>"Première",
"pret-a-porter"=>"prêt-à-porter",
"Pret-a-porter"=>"Prêt-à-porter",
"protege"=>"protégé",
"Protege"=>"Protégé",
"protegee"=>"protégée",
"Protegee"=>"Protégée",
"puree"=>"purée",
"Puree"=>"Purée",
"Quebecois"=>"Québécois",
"raison d'etre"=>"raison d'être",
"Raison d'etre"=>"Raison d'être",
"recherche"=>"recherché",
"Recherche"=>"Recherché",
"reclame"=>"réclame",
"Reclame"=>"Réclame",
"regime"=>"régime",
"Regime"=>"Régime",
"retrousse"=>"retroussé",
"Retrousse"=>"Retroussé",
"risque"=>"risqué",
"Risque"=>"Risqué",
"riviere"=>"rivière",
"Riviere"=>"Rivière",
"roman a clef"=>"roman à clef",
"Roman a clef"=>"Roman à clef",
"roue"=>"roué",
"Roue"=>"Roué",
"saute"=>"sauté",
"Saute"=>"Sauté",
"seance"=>"séance",
"Seance"=>"Séance",
"senor"=>"señor",
"Senor"=>"Señor",
"senora"=>"señora",
"Senora"=>"Señora",
"senorita"=>"señorita",
"Senorita"=>"Señorita",
"Sinn Fein"=>"Sinn Féin",
"smorgasbord"=>"smörgåsbord",
"Smorgasbord"=>"Smörgåsbord",
"smorgastarta"=>"smörgåstårta",
"Smorgastarta"=>"Smörgåstårta",
"soigne"=>"soigné",
"Soigne"=>"Soigné",
"soiree"=>"soirée",
"Soiree"=>"Soirée",
"souffle"=>"soufflé",
"Souffle"=>"Soufflé",
"soupcon"=>"soupçon",
"Soupcon"=>"Soupçon",
"surstromming"=>"surströmming",
"Surstromming"=>"Surströmming",
"tete-a-tete"=>"tête-à-tête",
"Tete-a-tete"=>"Tête-à-tête",
"touche"=>"touché",
"Touche"=>"Touché",
"tourtiere"=>"tourtière",
"Tourtiere"=>"Tourtière",
"uber"=>"über",
"Uber"=>"Über",
"Ubermensch"=>"Übermensch",
"Zaire"=>"Zaïre",
);
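The header comment describes the array as needle => replacement pairs. A minimal sketch of how such a map could be applied to running text follows, assuming whole-word matching and a longest-needle-first rule so that multi-word entries such as "creme brulee" take precedence over "creme". The function name `sketch_apply_diacritics()` is hypothetical, and this is not the library's `smart_diacritics()` implementation, which may behave differently.

```php
<?php
// Hypothetical helper for illustration only.
function sketch_apply_diacritics(string $text, array $diacriticWords): string
{
    if (count($diacriticWords) === 0) {
        return $text;
    }

    // Sort needles longest first: regex alternation tries alternatives in
    // order, so longer phrases must come before their shorter prefixes.
    $needles = array_map('strval', array_keys($diacriticWords));
    usort($needles, function ($a, $b) {
        return strlen($b) - strlen($a);
    });
    $alternatives = array_map(function ($needle) {
        return preg_quote($needle, '/');
    }, $needles);

    // The lookarounds reject matches embedded in a longer word or number.
    $pattern = '/(?<![\p{L}\p{N}])(?:' . implode('|', $alternatives) . ')(?![\p{L}\p{N}])/u';

    return preg_replace_callback($pattern, function ($match) use ($diacriticWords) {
        return $diacriticWords[$match[0]];
    }, $text);
}

// A three-entry excerpt of the table above.
$sample = array(
    "cafe" => "café",
    "creme" => "crème",
    "creme brulee" => "crème brûlée",
);
echo sketch_apply_diacritics("A cafe near the cafeteria serves creme brulee.", $sample), "\n";
// prints: A café near the cafeteria serves crème brûlée.
```

Matching is case-sensitive here, which is consistent with the table listing lowercase and capitalized forms as separate keys.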

1735
php-typography/lang/bg.php Normal file

File diff suppressed because it is too large

1052
php-typography/lang/ca.php Normal file

File diff suppressed because it is too large

3719
php-typography/lang/cs.php Normal file

File diff suppressed because it is too large

6824
php-typography/lang/cy.php Normal file

File diff suppressed because it is too large

1250
php-typography/lang/da.php Normal file

File diff suppressed because it is too large

14576
php-typography/lang/de.php Normal file

File diff suppressed because it is too large


@@ -0,0 +1,599 @@
<?php
/*
Project: PHP Typography
Project URI: http://kingdesk.com/projects/php-typography/
File modified to place pattern and exceptions in arrays that can be understood in php files.
This file is released under the same copyright as the below referenced original file
Original unmodified file is available at: http://mirror.unl.edu/ctan/language/hyph-utf8/tex/generic/hyph-utf8/patterns/
Original file name: hyph-el-monoton.tex
//============================================================================================================
ORIGINAL FILE INFO
% ****************************************************************
%
% File name: grmhyph5-unicode.tex
%
% This file was first created by mechanical translation from
% GRMhyph5.tex via 'elhyph-utf8 -m -c' (version 0.1 by Peter
% Heslin -- p.j.heslin@durham.ac.uk). Some additions were
% also made by hand.
%
% Created: June 6, 2008
%
% Hyphenation patterns for Modern Monotonic Greek.
%
% Created by Dimitrios Filippou with some ideas borrowed from
% Yannis Haralambous, Kostis Dryllerakis and Claudio Beccari.
%
% These hyphenation patterns are explained in 'ancient.pdf'.
% Hyphenation examples are given in the file 'anc-test.pdf'.
% Some doubtful patterns are marked by three question marks '???'.
%
% Documentation in English can be found in: D. Filippou,
% 'Hyphenation patterns for Ancient and Modern Greek, ' in
% 'TeX, XML, and Digital Typography' (A. Syropoulos et al.,
% eds.), Lecture Notes in Computer Science 3130, Springer-Verlag
% Berlin-Heidelberg, 2004. ISBN 3-540-22801-2.
%
% ****************************************************************
%
% \message{UTF-8 hyphenation patterns for Modern, Monotonic Greek}
//============================================================================================================
*/
$patgenLanguage = 'Greek (Modern Monotonic)';
$patgenExceptions = array();
$patgenMaxSeg = 4;
$patgen = array(
'begin'=>array(
'ι'=>'03',
'ί'=>'03',
'η'=>'03',
'ή'=>'03',
'υ'=>'03',
'ύ'=>'03',
'β'=>'04',
'γ'=>'04',
'δ'=>'04',
'ζ'=>'04',
'θ'=>'04',
'κ'=>'04',
'λ'=>'04',
'μ'=>'04',
'ν'=>'04',
'ξ'=>'04',
'π'=>'04',
'ρ'=>'04',
'σ'=>'04',
'ϲ'=>'04',
'τ'=>'04',
'φ'=>'04',
'χ'=>'04',
'ψ'=>'04'
),
'end'=>array(
'άη'=>'030',
'άι'=>'030',
'όη'=>'030',
'όι'=>'030',
'β'=>'40',
'γ'=>'40',
'γκ'=>'400',
'δ'=>'40',
'ζ'=>'40',
'θ'=>'40',
'κ'=>'40',
'λ'=>'40',
'μ'=>'40',
'μπ'=>'400',
'ν'=>'40',
'ντ'=>'400',
'ξ'=>'40',
'π'=>'40',
'ρ'=>'40',
'σ'=>'40',
'ϲ'=>'40',
'ς'=>'40',
'τ'=>'40',
'τζ'=>'400',
'τσ'=>'400',
'τϲ'=>'400',
'τς'=>'400',
'φ'=>'40',
'χ'=>'40',
'ψ'=>'40',
'βρ'=>'400',
'γλ'=>'400',
'κλ'=>'400',
'κτ'=>'400',
'γκς'=>'4000',
'γκϲ'=>'4000',
'γκσ'=>'4000',
'κς'=>'600',
'κϲ'=>'600',
'κσ'=>'400',
'λς'=>'400',
'λϲ'=>'400',
'λσ'=>'400',
'μπλ'=>'4000',
'μπν'=>'4000',
'μπρ'=>'4000',
'μς'=>'400',
'μϲ'=>'400',
'μσ'=>'400',
'νς'=>'400',
'νϲ'=>'400',
'νσ'=>'400',
'ρς'=>'400',
'ρϲ'=>'400',
'ρσ'=>'400',
'σκ'=>'400',
'ϲκ'=>'400',
'στ'=>'400',
'ϲτ'=>'400',
'τλ'=>'400',
'τρ'=>'400',
'ντς'=>'4000',
'ντϲ'=>'4000',
'ντσ'=>'4000',
'φτ'=>'400',
'χτ'=>'400'
),
'all'=>array(
'α'=>'01',
'ε'=>'01',
'η'=>'01',
'ι'=>'01',
'ο'=>'01',
'υ'=>'01',
'ω'=>'01',
'ϊ'=>'01',
'ϋ'=>'01',
'ά'=>'01',
'έ'=>'01',
'ή'=>'01',
'ί'=>'01',
'ό'=>'01',
'ύ'=>'01',
'ώ'=>'01',
'ΐ'=>'01',
'ΰ'=>'01',
'αι'=>'020',
'αί'=>'020',
'άι'=>'020',
'άϊ'=>'020',
'αυ'=>'020',
'αύ'=>'020',
'άυ'=>'030',
'ει'=>'020',
'εί'=>'020',
'έι'=>'020',
'έϊ'=>'020',
'ευ'=>'020',
'εύ'=>'020',
'έυ'=>'030',
'ηυ'=>'020',
'ηύ'=>'020',
'ήυ'=>'030',
'οι'=>'020',
'οί'=>'020',
'όι'=>'020',
'όϊ'=>'020',
'ου'=>'020',
'ού'=>'020',
'όυ'=>'030',
'υι'=>'020',
'υί'=>'020',
'ύι'=>'030',
'αη'=>'020',
'αϊ'=>'020',
'αϋ'=>'020',
'εϊ'=>'020',
'εϋ'=>'020',
'οει'=>'0200',
'οη'=>'020',
'οϊ'=>'020',
'ια'=>'020',
'ιά'=>'020',
'ιε'=>'020',
'ιέ'=>'020',
'ιο'=>'020',
'ιό'=>'020',
'οϊό'=>'0330',
'ιω'=>'020',
'ιώ'=>'020',
'ηα'=>'020',
'ηά'=>'020',
'ηε'=>'020',
'ηέ'=>'020',
'ηο'=>'020',
'ηό'=>'020',
'ηω'=>'020',
'ηώ'=>'020',
'υα'=>'020',
'υά'=>'020',
'υο'=>'020',
'υό'=>'020',
'υω'=>'020',
'υώ'=>'020',
'\''=>'40',
'ʼ'=>'40',
'᾿'=>'40',
'β\''=>'400',
'βʼ'=>'400',
'β᾿'=>'400',
'γ\''=>'400',
'γʼ'=>'400',
'γ᾿'=>'400',
'δ\''=>'400',
'δʼ'=>'400',
'δ᾿'=>'400',
'ζ\''=>'400',
'ζʼ'=>'400',
'ζ᾿'=>'400',
'θ\''=>'400',
'θʼ'=>'400',
'θ᾿'=>'400',
'κ\''=>'400',
'κʼ'=>'400',
'κ᾿'=>'400',
'λ\''=>'400',
'λʼ'=>'400',
'λ᾿'=>'400',
'μ\''=>'400',
'μʼ'=>'400',
'μ᾿'=>'400',
'μπ\''=>'4000',
'μπʼ'=>'4000',
'μπ᾿'=>'4000',
'ν\''=>'400',
'νʼ'=>'400',
'ν᾿'=>'400',
'ντ\''=>'4000',
'ντ’'=>'4000',
'ντ᾿'=>'4000',
'ξ\''=>'400',
'ξʼ'=>'400',
'ξ᾿'=>'400',
'π\''=>'400',
'πʼ'=>'400',
'π᾿'=>'400',
'ρ\''=>'400',
'ρʼ'=>'400',
'ρ᾿'=>'400',
'σ\''=>'400',
'σʼ'=>'400',
'σ᾿'=>'400',
'ϲ\''=>'400',
'ϲʼ'=>'400',
'ϲ᾿'=>'400',
'τ\''=>'400',
'τʼ'=>'400',
'τ᾿'=>'400',
'τζ\''=>'4000',
'τζʼ'=>'4000',
'τζ᾿'=>'4000',
'τσ\''=>'4000',
'τσʼ'=>'4000',
'τσ᾽'=>'4000',
'τϲ\''=>'4000',
'τϲʼ'=>'4000',
'τϲ᾿'=>'4000',
'φ\''=>'400',
'φʼ'=>'400',
'φ᾿'=>'400',
'χ\''=>'400',
'χʼ'=>'400',
'χ᾿'=>'400',
'ψ\''=>'400',
'ψʼ'=>'400',
'ψ᾿'=>'400',
'ββ'=>'410',
'γγ'=>'410',
'δδ'=>'410',
'ζζ'=>'410',
'θθ'=>'410',
'κκ'=>'410',
'λλ'=>'410',
'μμ'=>'410',
'νν'=>'410',
'ππ'=>'410',
'ρρ'=>'410',
'σσ'=>'410',
'ϲϲ'=>'410',
'ττ'=>'410',
'φφ'=>'410',
'χχ'=>'410',
'ψψ'=>'410',
'βζ'=>'410',
'βθ'=>'410',
'βκ'=>'410',
'βμ'=>'410',
'βν'=>'410',
'βξ'=>'410',
'βπ'=>'410',
'βσ'=>'410',
'βϲ'=>'410',
'βτ'=>'410',
'βφ'=>'410',
'βχ'=>'410',
'βψ'=>'410',
'γβ'=>'410',
'γζ'=>'410',
'γθ'=>'410',
'γμ'=>'410',
'ργμ'=>'4520',
'γξ'=>'410',
'γπ'=>'410',
'γσ'=>'410',
'γϲ'=>'410',
'γτ'=>'410',
'γφ'=>'410',
'γχ'=>'410',
'γψ'=>'410',
'δβ'=>'410',
'δγ'=>'410',
'δζ'=>'410',
'δθ'=>'410',
'δκ'=>'410',
'δλ'=>'410',
'δξ'=>'410',
'δπ'=>'410',
'δσ'=>'410',
'δϲ'=>'410',
'δτ'=>'410',
'δφ'=>'410',
'δχ'=>'410',
'δψ'=>'410',
'ζβ'=>'410',
'ζγ'=>'410',
'ζδ'=>'410',
'ζθ'=>'410',
'ζκ'=>'410',
'ζλ'=>'410',
'ζμ'=>'410',
'τζμ'=>'0020',
'ζν'=>'410',
'ζξ'=>'410',
'ζπ'=>'410',
'ζρ'=>'410',
'ζσ'=>'410',
'ζϲ'=>'410',
'ζτ'=>'410',
'ζφ'=>'410',
'ζχ'=>'410',
'ζψ'=>'410',
'θβ'=>'410',
'θγ'=>'410',
'θδ'=>'410',
'θζ'=>'410',
'θκ'=>'410',
'θμ'=>'410',
'ρθμ'=>'4520',
'σθμ'=>'0020',
'ϲθμ'=>'0020',
'θξ'=>'410',
'θπ'=>'410',
'θσ'=>'410',
'θϲ'=>'410',
'θτ'=>'410',
'θφ'=>'410',
'θχ'=>'410',
'θψ'=>'410',
'κβ'=>'410',
'κγ'=>'410',
'κδ'=>'410',
'κζ'=>'410',
'κθ'=>'410',
'κμ'=>'410',
'λκμ'=>'4520',
'ρκμ'=>'4520',
'κξ'=>'410',
'κπ'=>'410',
'κσ'=>'410',
'κϲ'=>'410',
'κφ'=>'410',
'νκφ'=>'4520',
'κχ'=>'410',
'κψ'=>'410',
'λβ'=>'410',
'λγ'=>'410',
'λδ'=>'410',
'λζ'=>'410',
'λθ'=>'410',
'λκ'=>'410',
'λμ'=>'410',
'λν'=>'410',
'λξ'=>'410',
'λπ'=>'410',
'λρ'=>'410',
'λσ'=>'410',
'λϲ'=>'410',
'λτ'=>'410',
'λφ'=>'410',
'λχ'=>'410',
'λψ'=>'410',
'μβ'=>'410',
'μγ'=>'410',
'μδ'=>'410',
'μζ'=>'410',
'μθ'=>'410',
'μκ'=>'410',
'μλ'=>'410',
'μξ'=>'410',
'μρ'=>'410',
'μσ'=>'410',
'μϲ'=>'410',
'μτ'=>'410',
'μφ'=>'410',
'μχ'=>'410',
'μψ'=>'410',
'νβ'=>'410',
'νγ'=>'410',
'νδ'=>'410',
'νζ'=>'410',
'νθ'=>'410',
'νκ'=>'410',
'νλ'=>'410',
'νμ'=>'410',
'νξ'=>'410',
'νπ'=>'410',
'νρ'=>'410',
'νσ'=>'410',
'νϲ'=>'410',
'νφ'=>'410',
'νχ'=>'410',
'νψ'=>'410',
'ξβ'=>'410',
'ξγ'=>'410',
'ξδ'=>'410',
'ξζ'=>'410',
'ξθ'=>'410',
'ξκ'=>'410',
'ξλ'=>'410',
'ξμ'=>'410',
'ξν'=>'410',
'ξπ'=>'410',
'ξρ'=>'410',
'ξσ'=>'410',
'ξϲ'=>'410',
'ξτ'=>'410',
'γξτ'=>'4520',
'ρξτ'=>'4520',
'ξφ'=>'410',
'ξχ'=>'410',
'ξψ'=>'410',
'πβ'=>'410',
'πγ'=>'410',
'πδ'=>'410',
'πζ'=>'410',
'πθ'=>'410',
'πκ'=>'410',
'πμ'=>'410',
'πξ'=>'410',
'πσ'=>'410',
'πϲ'=>'410',
'πφ'=>'410',
'πχ'=>'410',
'πψ'=>'410',
'ρβ'=>'410',
'ργ'=>'410',
'ρδ'=>'410',
'ρζ'=>'410',
'ρθ'=>'410',
'ρκ'=>'410',
'ρλ'=>'410',
'ρμ'=>'410',
'ρν'=>'410',
'ρξ'=>'410',
'ρπ'=>'410',
'ρσ'=>'410',
'ρϲ'=>'410',
'ρτ'=>'410',
'ρφ'=>'410',
'ρχ'=>'410',
'ρψ'=>'410',
'σδ'=>'410',
'ϲδ'=>'410',
'σζ'=>'410',
'ϲζ'=>'410',
'σν'=>'410',
'ϲν'=>'410',
'σξ'=>'410',
'ϲξ'=>'410',
'σρ'=>'410',
'ϲρ'=>'410',
'σψ'=>'410',
'ϲψ'=>'410',
'τβ'=>'410',
'τγ'=>'410',
'τδ'=>'410',
'τθ'=>'410',
'τκ'=>'410',
'τν'=>'410',
'τξ'=>'410',
'τπ'=>'410',
'τφ'=>'410',
'στφ'=>'0020',
'ϲτφ'=>'0020',
'τχ'=>'410',
'τψ'=>'410',
'φβ'=>'410',
'φγ'=>'410',
'φδ'=>'410',
'φζ'=>'410',
'φκ'=>'410',
'φμ'=>'410',
'φν'=>'410',
'ρφν'=>'4520',
'φξ'=>'410',
'φπ'=>'410',
'φσ'=>'410',
'φϲ'=>'410',
'φχ'=>'410',
'φψ'=>'410',
'χβ'=>'410',
'χγ'=>'410',
'χδ'=>'410',
'χζ'=>'410',
'χκ'=>'410',
'χμ'=>'410',
'ρχμ'=>'4520',
'χξ'=>'410',
'χπ'=>'410',
'χσ'=>'410',
'χϲ'=>'410',
'χφ'=>'410',
'χψ'=>'410',
'ψβ'=>'410',
'ψγ'=>'410',
'ψδ'=>'410',
'ψζ'=>'410',
'ψθ'=>'410',
'ψκ'=>'410',
'ψλ'=>'410',
'ψμ'=>'410',
'ψν'=>'410',
'ψξ'=>'410',
'ψπ'=>'410',
'ψρ'=>'410',
'ψσ'=>'410',
'ψϲ'=>'410',
'ψτ'=>'410',
'μψτ'=>'4520',
'ψφ'=>'410',
'ψχ'=>'410',
'γκφ'=>'4520',
'γκτ'=>'4100',
'μπτ'=>'4100',
'ντζ'=>'4100',
'ντσ'=>'4100',
'ντϲ'=>'4100',
'γκμπ'=>'40100',
'γκντ'=>'40100',
'γκτζ'=>'40100',
'γκτσ'=>'40100',
'γκτϲ'=>'40100',
'μπντ'=>'40100',
'μπτζ'=>'40100',
'μπτσ'=>'40100',
'μπτϲ'=>'40100',
'ντμπ'=>'40100',
'τσγκ'=>'40100',
'τϲγκ'=>'40100',
'τσμπ'=>'40100',
'τϲμπ'=>'40100',
'τσντ'=>'40100',
'τϲντ'=>'40100'
)
);
?>

File diff suppressed because it is too large

File diff suppressed because it is too large

File diff suppressed because it is too large

3051
php-typography/lang/es.php Normal file

File diff suppressed because it is too large

3803
php-typography/lang/et.php Normal file

File diff suppressed because it is too large

268
php-typography/lang/eu.php Normal file

@@ -0,0 +1,268 @@
<?php
/*
Project: PHP Typography
Project URI: http://kingdesk.com/projects/php-typography/
File modified to place pattern and exceptions in arrays that can be understood in php files.
This file is released under the same copyright as the below referenced original file
Original unmodified file is available at: http://mirror.unl.edu/ctan/language/hyph-utf8/tex/generic/hyph-utf8/patterns/
Original file name: hyph-eu.tex
//============================================================================================================
ORIGINAL FILE INFO
% Hyphenation patterns for Basque.
%
% This file has first been written by Juan M. Aguirregabiria
% (juanmari.aguirregabiria@ehu.es) on February 1997 based on the
% shyphen.sh script that generates the Spanish patterns as compiled
% by Julio Sanchez (jsanchez@gmv.es) on September 1991.
%
% In June 2008 the generating script has been rewritten into ruby and
% adapted for native UTF-8 TeX engines. Patterns became part of hyph-utf8
% package and were renamed from bahyph.tex into hyph-eu.tex.
% Functionality should not change apart from adding ñ by default.
%
% The original Copyright followed and applied also to the predecessor of this file
% whose last version will be always available by anonymous ftp
% from tp.lc.ehu.es or by pointing your Web browser to
% http://tp.lc.ehu.es/jma/basque.html
%
% For more information about the new UTF-8 hyphenation patterns and
% links to this file see
% http://www.tug.org/tex-hyphen/
%
% COPYRIGHT NOTICE
%
% These patterns and the generating script are Copyright (c) JMA 1997, 2008
% These patterns are made public in the hope that they will benefit others.
% You can use this software for any purpose.
% However, this is given for free and WITHOUT ANY WARRANTY.
%
% You are kindly requested to send any changes to the author.
% If you change the generating script, you must include code
% in it such that any output is clearly labeled as generated
% by a modified script.
%
% END OF COPYRIGHT NOTICE
%
% Open vowels: a e o
% Closed vowels: i u
% Consonants: b c d f g j k l m n ñ p q r s t v w x y z
%
% Some of the patterns below represent combinations that never
% happen in Basque. Would they happen, they would be hyphenated
% according to the rules.
%
//============================================================================================================
*/
$patgenLanguage = 'Basque';
$patgenExceptions = array();
$patgenMaxSeg = 4;
$patgen = array(
'begin'=>array(),
'end'=>array(),
'all'=>array(
'ba'=>'100',
'be'=>'100',
'bo'=>'100',
'bi'=>'100',
'bu'=>'100',
'ca'=>'100',
'ce'=>'100',
'co'=>'100',
'ci'=>'100',
'cu'=>'100',
'da'=>'100',
'de'=>'100',
'do'=>'100',
'di'=>'100',
'du'=>'100',
'fa'=>'100',
'fe'=>'100',
'fo'=>'100',
'fi'=>'100',
'fu'=>'100',
'ga'=>'100',
'ge'=>'100',
'go'=>'100',
'gi'=>'100',
'gu'=>'100',
'ja'=>'100',
'je'=>'100',
'jo'=>'100',
'ji'=>'100',
'ju'=>'100',
'ka'=>'100',
'ke'=>'100',
'ko'=>'100',
'ki'=>'100',
'ku'=>'100',
'la'=>'100',
'le'=>'100',
'lo'=>'100',
'li'=>'100',
'lu'=>'100',
'ma'=>'100',
'me'=>'100',
'mo'=>'100',
'mi'=>'100',
'mu'=>'100',
'na'=>'100',
'ne'=>'100',
'no'=>'100',
'ni'=>'100',
'nu'=>'100',
'ña'=>'100',
'ñe'=>'100',
'ño'=>'100',
'ñi'=>'100',
'ñu'=>'100',
'pa'=>'100',
'pe'=>'100',
'po'=>'100',
'pi'=>'100',
'pu'=>'100',
'qa'=>'100',
'qe'=>'100',
'qo'=>'100',
'qi'=>'100',
'qu'=>'100',
'ra'=>'100',
're'=>'100',
'ro'=>'100',
'ri'=>'100',
'ru'=>'100',
'sa'=>'100',
'se'=>'100',
'so'=>'100',
'si'=>'100',
'su'=>'100',
'ta'=>'100',
'te'=>'100',
'to'=>'100',
'ti'=>'100',
'tu'=>'100',
'va'=>'100',
've'=>'100',
'vo'=>'100',
'vi'=>'100',
'vu'=>'100',
'wa'=>'100',
'we'=>'100',
'wo'=>'100',
'wi'=>'100',
'wu'=>'100',
'xa'=>'100',
'xe'=>'100',
'xo'=>'100',
'xi'=>'100',
'xu'=>'100',
'ya'=>'100',
'ye'=>'100',
'yo'=>'100',
'yi'=>'100',
'yu'=>'100',
'za'=>'100',
'ze'=>'100',
'zo'=>'100',
'zi'=>'100',
'zu'=>'100',
'lla'=>'1200',
'lle'=>'1200',
'llo'=>'1200',
'lli'=>'1200',
'llu'=>'1200',
'rra'=>'1200',
'rre'=>'1200',
'rro'=>'1200',
'rri'=>'1200',
'rru'=>'1200',
'tsa'=>'1200',
'tse'=>'1200',
'tso'=>'1200',
'tsi'=>'1200',
'tsu'=>'1200',
'txa'=>'1200',
'txe'=>'1200',
'txo'=>'1200',
'txi'=>'1200',
'txu'=>'1200',
'tza'=>'1200',
'tze'=>'1200',
'tzo'=>'1200',
'tzi'=>'1200',
'tzu'=>'1200',
'bla'=>'1200',
'ble'=>'1200',
'blo'=>'1200',
'bli'=>'1200',
'blu'=>'1200',
'bra'=>'1200',
'bre'=>'1200',
'bro'=>'1200',
'bri'=>'1200',
'bru'=>'1200',
'dra'=>'1200',
'dre'=>'1200',
'dro'=>'1200',
'dri'=>'1200',
'dru'=>'1200',
'fla'=>'1200',
'fle'=>'1200',
'flo'=>'1200',
'fli'=>'1200',
'flu'=>'1200',
'fra'=>'1200',
'fre'=>'1200',
'fro'=>'1200',
'fri'=>'1200',
'fru'=>'1200',
'gla'=>'1200',
'gle'=>'1200',
'glo'=>'1200',
'gli'=>'1200',
'glu'=>'1200',
'gra'=>'1200',
'gre'=>'1200',
'gro'=>'1200',
'gri'=>'1200',
'gru'=>'1200',
'kla'=>'1200',
'kle'=>'1200',
'klo'=>'1200',
'kli'=>'1200',
'klu'=>'1200',
'kra'=>'1200',
'kre'=>'1200',
'kro'=>'1200',
'kri'=>'1200',
'kru'=>'1200',
'pla'=>'1200',
'ple'=>'1200',
'plo'=>'1200',
'pli'=>'1200',
'plu'=>'1200',
'pra'=>'1200',
'pre'=>'1200',
'pro'=>'1200',
'pri'=>'1200',
'pru'=>'1200',
'tra'=>'1200',
'tre'=>'1200',
'tro'=>'1200',
'tri'=>'1200',
'tru'=>'1200',
'subr'=>'00220',
'subl'=>'00220'
)
);
?>
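The file header notes that the original TeX patterns were rearranged into PHP arrays. Judging from the entries above, each key seems to hold the letters of a pattern and each value one digit per gap around those letters (so the value is one character longer than the key), with absent digits written as 0. The sketch below illustrates that mapping only; it is not the converter that was actually used. `sketch_tex_pattern_to_entry` is a hypothetical name, and the TeX spellings in the comments (`1ba`, `su2b2r`) are inferred from the array entries, not quoted from the original `.tex` file.

```php
<?php
// Hypothetical helper (not part of PHP Typography): turns one TeX-style
// hyphenation pattern into the (position, key, digits) triple that the
// $patgen arrays in these language files appear to use.
// Patterns anchored at both ends (".abc.") are not handled in this sketch.
function sketch_tex_pattern_to_entry(string $pattern): array
{
    $position = 'all';
    if (substr($pattern, 0, 1) === '.') {
        // A leading dot anchors the pattern at the start of a word.
        $position = 'begin';
        $pattern = substr($pattern, 1);
    } elseif (substr($pattern, -1) === '.') {
        // A trailing dot anchors the pattern at the end of a word.
        $position = 'end';
        $pattern = substr($pattern, 0, -1);
    }

    $letters = '';
    $digits = '';
    $pending = '0'; // value of the gap before the next letter, 0 if unstated

    // Split on UTF-8 characters so letters such as 'ñ' stay intact.
    foreach (preg_split('//u', $pattern, -1, PREG_SPLIT_NO_EMPTY) as $c) {
        if (preg_match('/^[0-9]$/', $c) === 1) {
            $pending = $c;
        } else {
            $letters .= $c;
            $digits .= $pending;
            $pending = '0';
        }
    }
    $digits .= $pending; // gap after the last letter

    return array($position, $letters, $digits);
}

// Assumed TeX spellings of two entries from the Basque array above:
// '1ba'    would map to 'ba'   => '100'
// 'su2b2r' would map to 'subr' => '00220'
```

Under this reading, the `begin` and `end` arrays would hold the patterns that TeX writes with a leading or trailing dot.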

396
php-typography/lang/fi.php Normal file

@ -0,0 +1,396 @@
<?php
/*
Project: PHP Typography
Project URI: http://kingdesk.com/projects/php-typography/
File modified to place pattern and exceptions in arrays that can be understood in php files.
This file is released under the same copyright as the below referenced original file
Original unmodified file is available at: http://mirror.unl.edu/ctan/language/hyph-utf8/tex/generic/hyph-utf8/patterns/
Original file name: hyph-_______________.tex
//============================================================================================================
ORIGINAL FILE INFO
% This file is part of hyph-utf8 package and resulted from
% semi-manual conversions of hyphenation patterns into UTF-8 in June 2008.
%
% Source: fihyph.tex (yyyy-mm-dd)
% Author: Kauko Saarinen
%
% The above mentioned file should become obsolete,
% and the author of the original file should preferably modify this file instead.
%
% Modifications were needed in order to support native UTF-8 engines,
% but functionality (hopefully) didn't change in any way, at least not intentionally.
% This file is no longer stand-alone; at least for 8-bit engines
% you probably want to use loadhyph-foo.tex (which will load this file) instead.
%
% Modifications were done by Jonathan Kew, Mojca Miklavec & Arthur Reutenauer
% with help & support from:
% - Karl Berry, who gave us free hands and all resources
% - Taco Hoekwater, with useful macros
% - Hans Hagen, who did the unicodifisation of patterns already long before
% and helped with testing, suggestions and bug reports
% - Norbert Preining, who tested & integrated patterns into TeX Live
%
% However, the 'copyright/copyleft' owner of patterns remains the original author.
%
% The copyright statement of this file is thus:
%
% Do with this file whatever needs to be done in future for the sake of
% 'a better world' as long as you respect the copyright of original file.
% If you're the original author of patterns or taking over a new revolution,
% please remove all of the TUG comments & credits that we added here -
% you are the Queen / the King, we are only the servants.
%
% If you want to change this file, rather than uploading directly to CTAN,
% we would be grateful if you could send it to us (http://tug.org/tex-hyphen)
% or ask for credentials for SVN repository and commit it yourself;
% we will then upload the whole 'package' to CTAN.
%
% Before a new 'pattern-revolution' starts,
% please try to follow some guidelines if possible:
%
% - \lccode is *forbidden*, and I really mean it
% - all the patterns should be in UTF-8
% - the only 'allowed' TeX commands in this file are: \patterns, \hyphenation,
% and if you really cannot do without, also \input and \message
% - in particular, please no \catcode or \lccode changes,
% they belong to loadhyph-foo.tex,
% and no \lefthyphenmin and \righthyphenmin,
% they have no influence here and belong elsewhere
% - \begingroup and/or \endinput is not needed
% - feel free to do whatever you want inside comments
%
% We know that TeX is extremely powerful, but give a stupid parser
% at least a chance to read your patterns.
%
% For more information see
%
% http://tug.org/tex-hyphen
%
%------------------------------------------------------------------------------
%
% -----> Finnish hyphenation patterns for MLPCTeX <------
% First release January -86 by Kauko Saarinen,
% Computing Centre, University of Jyvaskyla, Finland
%
% Completely rewritten January -88. The new patterns make
% far fewer mistakes with foreign and compound words.
% The article 'Automatic Hyphenation of Finnish'
% by Professor Fred Karlsson is also referenced
% ---------------------------------------------------------
%
% 8th March -89 (vers. 2.2), some vowel triples by Fred Karlsson added.
% 9th January - 95: added \uccode and \lccode by Thomas Esser
%
% ********* Patterns may be freely distributed **********
%
//============================================================================================================
*/
$patgenLanguage = 'Finnish';
$patgenExceptions = array();
$patgenMaxSeg = 7;
$patgen = array(
'begin'=>array(
'ä'=>'02',
'ydin'=>'00021',
'suura'=>'000212'
),
'end'=>array(
'sidea'=>'212000'
),
'all'=>array(
'ba'=>'100',
'be'=>'100',
'bi'=>'100',
'bo'=>'100',
'bu'=>'100',
'by'=>'100',
'da'=>'100',
'de'=>'100',
'di'=>'100',
'do'=>'100',
'du'=>'100',
'dy'=>'100',
'dä'=>'100',
'dö'=>'100',
'fa'=>'100',
'fe'=>'100',
'fi'=>'100',
'fo'=>'100',
'fu'=>'100',
'fy'=>'100',
'ga'=>'100',
'ge'=>'100',
'gi'=>'100',
'go'=>'100',
'gu'=>'100',
'gy'=>'100',
'gä'=>'100',
'gö'=>'100',
'ha'=>'100',
'he'=>'100',
'hi'=>'100',
'ho'=>'100',
'hu'=>'100',
'hy'=>'100',
'hä'=>'100',
'hö'=>'100',
'ja'=>'100',
'je'=>'100',
'ji'=>'100',
'jo'=>'100',
'ju'=>'100',
'jy'=>'100',
'jä'=>'100',
'jö'=>'100',
'ka'=>'100',
'ke'=>'100',
'ki'=>'100',
'ko'=>'100',
'ku'=>'100',
'ky'=>'100',
'kä'=>'100',
'kö'=>'100',
'la'=>'100',
'le'=>'100',
'li'=>'100',
'lo'=>'100',
'lu'=>'100',
'ly'=>'100',
'lä'=>'100',
'lö'=>'100',
'ma'=>'100',
'me'=>'100',
'mi'=>'100',
'mo'=>'100',
'mu'=>'100',
'my'=>'100',
'mä'=>'100',
'mö'=>'100',
'na'=>'100',
'ne'=>'100',
'ni'=>'100',
'no'=>'100',
'nu'=>'100',
'ny'=>'100',
'nä'=>'100',
'nö'=>'100',
'pa'=>'100',
'pe'=>'100',
'pi'=>'100',
'po'=>'100',
'pu'=>'100',
'py'=>'100',
'pä'=>'100',
'pö'=>'100',
'ra'=>'100',
're'=>'100',
'ri'=>'100',
'ro'=>'100',
'ru'=>'100',
'ry'=>'100',
'rä'=>'100',
'rö'=>'100',
'sa'=>'100',
'se'=>'100',
'si'=>'100',
'so'=>'100',
'su'=>'100',
'sy'=>'100',
'sä'=>'100',
'sö'=>'100',
'ta'=>'100',
'te'=>'100',
'ti'=>'100',
'to'=>'100',
'tu'=>'100',
'ty'=>'100',
'tä'=>'100',
'tö'=>'100',
'va'=>'100',
've'=>'100',
'vi'=>'100',
'vo'=>'100',
'vu'=>'100',
'vy'=>'100',
'vä'=>'100',
'vö'=>'100',
'str'=>'1020',
'äy'=>'020',
'ya'=>'012',
'yo'=>'012',
'oy'=>'010',
'öy'=>'020',
'uy'=>'012',
'yu'=>'012',
'öa'=>'032',
'öo'=>'032',
'äa'=>'032',
'äo'=>'032',
'äu'=>'012',
'öu'=>'012',
'aä'=>'010',
'aö'=>'010',
'oä'=>'010',
'oö'=>'010',
'uä'=>'012',
'uö'=>'012',
'ää'=>'020',
'öö'=>'020',
'äö'=>'020',
'öä'=>'020',
'aai'=>'0012',
'aae'=>'0012',
'aao'=>'0012',
'aau'=>'0012',
'eea'=>'0012',
'eei'=>'0012',
'eeu'=>'0012',
'eey'=>'0012',
'iia'=>'0012',
'iie'=>'0012',
'iio'=>'0012',
'uua'=>'0012',
'uue'=>'0012',
'uuo'=>'0012',
'uui'=>'0012',
'eaa'=>'0100',
'iaa'=>'0100',
'oaa'=>'0100',
'uaa'=>'0100',
'uee'=>'0100',
'auu'=>'0100',
'iuu'=>'0100',
'euu'=>'0100',
'ouu'=>'0100',
'ääi'=>'0010',
'ääe'=>'0010',
'ääy'=>'0030',
'iää'=>'0100',
'eää'=>'0100',
'yää'=>'0100',
'iöö'=>'0100',
'aei'=>'0100',
'aoi'=>'0100',
'eai'=>'0100',
'iau'=>'0100',
'yei'=>'0100',
'aia'=>'0010',
'aie'=>'0010',
'aio'=>'0010',
'aiu'=>'0010',
'aua'=>'0010',
'aue'=>'0010',
'eua'=>'0010',
'iea'=>'0010',
'ieo'=>'0010',
'iey'=>'0010',
'ioa'=>'0012',
'ioe'=>'0012',
'iua'=>'0010',
'iue'=>'0010',
'iuo'=>'0010',
'oia'=>'0010',
'oie'=>'0010',
'oio'=>'0010',
'oiu'=>'0010',
'oui'=>'0100',
'oue'=>'0010',
'ouo'=>'0010',
'uea'=>'0010',
'uie'=>'0010',
'uoa'=>'0010',
'uou'=>'0010',
'eö'=>'012',
'öe'=>'012',
'us'=>'020',
'yliop'=>'000120',
'aliav'=>'000120',
'spli'=>'10200',
'alous'=>'000001',
'keus'=>'00001',
'rtaus'=>'000001',
'sohje'=>'210000',
'sasia'=>'212000',
'asian'=>'120000',
'asiat'=>'120000',
'asioi'=>'120000',
'ras'=>'0200',
'las'=>'0200',
'sopisk'=>'2120000',
'nopet'=>'212000',
'saloi'=>'212000',
'nopist'=>'2120000',
'sopist'=>'2120000',
'sosa'=>'21200',
'nosa'=>'21200',
'alkeis'=>'0000021',
'perus'=>'000001',
'sidean'=>'2120000',
'sesity'=>'2120000',
'nedus'=>'212000',
'sajatu'=>'2100000',
'sase'=>'21000',
'sapu'=>'21000',
'syrit'=>'212000',
'syhti'=>'212000',
'notto'=>'210000',
'noton'=>'210000',
'nanto'=>'210000',
'nanno'=>'210000',
'najan'=>'212000',
'naika'=>'210000',
'nomai'=>'212000',
'nylit'=>'212000',
'salen'=>'212000',
'nalen'=>'212000',
'asiakas'=>'12000021',
'ulos'=>'00021',
'najo'=>'21200',
'sajo'=>'21200',
'bl'=>'020',
'blo'=>'1200',
'bibli'=>'000300',
'br'=>'020',
'bri'=>'1200',
'bro'=>'1200',
'bru'=>'1200',
'dr'=>'020',
'dra'=>'1200',
'fl'=>'020',
'fla'=>'1200',
'fr'=>'020',
'fra'=>'1200',
'fre'=>'1200',
'gl'=>'020',
'glo'=>'1200',
'gr'=>'020',
'gra'=>'1200',
'kl'=>'020',
'kra'=>'1200',
'kre'=>'1200',
'kri'=>'1200',
'kv'=>'120',
'kva'=>'1200',
'pl'=>'020',
'pr'=>'020',
'pro'=>'1200',
'cl'=>'020',
'qv'=>'020',
'qvi'=>'1200',
'sch'=>'0020',
'tsh'=>'0020',
'chr'=>'0020'
)
);
?>
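The `begin`, `end`, and `all` arrays above look like Liang-style hyphenation patterns. As a rough illustration of how such data is commonly applied, here is a minimal sketch of the standard algorithm: every matching segment contributes its digits, the highest digit wins at each gap, and odd values mark permitted breaks. This is not the code in `php-typography.php`; the function name `sketch_hyphenate` and the `$minBefore`/`$minAfter` parameters are assumptions made for the example.

```php
<?php
// Hypothetical helper (not part of PHP Typography): applies Liang-style
// patterns stored in the $patgen layout used by these language files.
function sketch_hyphenate(
    string $word,
    array $patgen,
    int $maxSeg,
    int $minBefore = 2,
    int $minAfter = 2
): string {
    // Split into UTF-8 characters so keys such as 'ää' match correctly.
    $chars = preg_split('//u', $word, -1, PREG_SPLIT_NO_EMPTY);
    $n = count($chars);
    // $scores[$i] is the value of the gap before character $i.
    $scores = array_fill(0, $n + 1, 0);

    for ($start = 0; $start < $n; $start++) {
        for ($len = 1; $len <= $maxSeg && $start + $len <= $n; $len++) {
            $segment = implode('', array_slice($chars, $start, $len));
            $candidates = array();
            if (isset($patgen['all'][$segment])) {
                $candidates[] = $patgen['all'][$segment];
            }
            if ($start === 0 && isset($patgen['begin'][$segment])) {
                $candidates[] = $patgen['begin'][$segment];
            }
            if ($start + $len === $n && isset($patgen['end'][$segment])) {
                $candidates[] = $patgen['end'][$segment];
            }
            foreach ($candidates as $digits) {
                // Digit $k scores the gap before the segment's $k-th character.
                foreach (str_split($digits) as $k => $d) {
                    $scores[$start + $k] = max($scores[$start + $k], (int) $d);
                }
            }
        }
    }

    $out = '';
    for ($i = 0; $i < $n; $i++) {
        // An odd score permits a break; keep a minimum of characters per side.
        if ($i >= $minBefore && $n - $i >= $minAfter && $scores[$i] % 2 === 1) {
            $out .= '-';
        }
        $out .= $chars[$i];
    }
    return $out;
}
```

With only the entries `'ta'=>'100'` and `'lo'=>'100'` loaded, `sketch_hyphenate('talo', $patgen, 7)` yields `ta-lo`. The even digits (2, 4) matter because they can outvote an odd digit from another pattern and so suppress a break.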

1279
php-typography/lang/fr.php Normal file

File diff suppressed because it is too large

6212
php-typography/lang/ga.php Normal file

File diff suppressed because it is too large

2383
php-typography/lang/gl.php Normal file

File diff suppressed because it is too large

3097
php-typography/lang/grc.php Normal file

File diff suppressed because it is too large

1578
php-typography/lang/hr.php Normal file

File diff suppressed because it is too large

13505
php-typography/lang/hu.php Normal file

File diff suppressed because it is too large

771
php-typography/lang/ia.php Normal file

@ -0,0 +1,771 @@
<?php
/*
Project: PHP Typography
Project URI: http://kingdesk.com/projects/php-typography/
File modified to place pattern and exceptions in arrays that can be understood in php files.
This file is released under the same copyright as the below referenced original file
Original unmodified file is available at: http://mirror.unl.edu/ctan/language/hyph-utf8/tex/generic/hyph-utf8/patterns/
Original file name: hyph-ia.tex
//============================================================================================================
ORIGINAL FILE INFO
% This file is part of hyph-utf8 package and resulted from
% semi-manual conversions of hyphenation patterns into UTF-8 in June 2008.
%
% Source: iahyphen.tex (2005-06-28)
% Author: Peter Kleiweg <p.c.j.kleiweg at rug.nl>
%
% The above mentioned file should become obsolete,
% and the author of the original file should preferably modify this file instead.
%
% Modifications were needed in order to support native UTF-8 engines,
% but functionality (hopefully) didn't change in any way, at least not intentionally.
% This file is no longer stand-alone; at least for 8-bit engines
% you probably want to use loadhyph-foo.tex (which will load this file) instead.
%
% Modifications were done by Jonathan Kew, Mojca Miklavec & Arthur Reutenauer
% with help & support from:
% - Karl Berry, who gave us free hands and all resources
% - Taco Hoekwater, with useful macros
% - Hans Hagen, who did the unicodifisation of patterns already long before
% and helped with testing, suggestions and bug reports
% - Norbert Preining, who tested & integrated patterns into TeX Live
%
% However, the 'copyright/copyleft' owner of patterns remains the original author.
%
% The copyright statement of this file is thus:
%
% Do with this file whatever needs to be done in future for the sake of
% 'a better world' as long as you respect the copyright of original file.
% If you're the original author of patterns or taking over a new revolution,
% please remove all of the TUG comments & credits that we added here -
% you are the Queen / the King, we are only the servants.
%
% If you want to change this file, rather than uploading directly to CTAN,
% we would be grateful if you could send it to us (http://tug.org/tex-hyphen)
% or ask for credentials for SVN repository and commit it yourself;
% we will then upload the whole 'package' to CTAN.
%
% Before a new 'pattern-revolution' starts,
% please try to follow some guidelines if possible:
%
% - \lccode is *forbidden*, and I really mean it
% - all the patterns should be in UTF-8
% - the only 'allowed' TeX commands in this file are: \patterns, \hyphenation,
% and if you really cannot do without, also \input and \message
% - in particular, please no \catcode or \lccode changes,
% they belong to loadhyph-foo.tex,
% and no \lefthyphenmin and \righthyphenmin,
% they have no influence here and belong elsewhere
% - \begingroup and/or \endinput is not needed
% - feel free to do whatever you want inside comments
%
% We know that TeX is extremely powerful, but give a stupid parser
% at least a chance to read your patterns.
%
% For more information see
%
% http://tug.org/tex-hyphen
%
%------------------------------------------------------------------------------
%
% File: iahyphen.tex
% TeX hyphenation patterns for Interlingua.
% Version 0.2b. Released 3 July 2001.
% version 0.2c Released 28 June 2005 (added LPPL header)
% Created by Peter Kleiweg, p.c.j.kleiweg at rug.nl
% About Interlingua: http://www.interlingua.com/
%
% \iffalse meta-comment
%
% Copyright 1989-2005 Peter Kleiweg. All rights reserved.
%
% This file is distributed as part of the Babel system.
% -----------------------------------------------------
%
% It may be distributed and/or modified under the
% conditions of the LaTeX Project Public License, either version 1.3
% of this license or (at your option) any later version.
% The latest version of this license is in
% http://www.latex-project.org/lppl.txt
% and version 1.3 or later is part of all distributions of LaTeX
% version 2003/12/01 or later.
%
% This work has the LPPL maintenance status 'maintained'.
%
% The Current Maintainer of this work is Peter Kleiweg.
%
% The list of all files belonging to the Babel system is
% given in the file `manifest.bbl'. See also `legal.bbl' for additional
% information.
//============================================================================================================
*/
$patgenLanguage = 'Interlingua';
$patgenExceptions = array(
'alcun'=>'alc-un',
'alcunissime'=>'alc-u-nis-si-me',
'alcunmente'=>'alc-un-men-te',
'alicun'=>'a-lic-un',
'alicunissime'=>'a-lic-u-nis-si-me',
'alicunmente'=>'a-lic-un-men-te',
'moslem'=>'mos-lem',
'qualcun'=>'qualc-un',
'qualcunissime'=>'qualc-u-nis-si-me',
'qualcunmente'=>'qualc-un-men-te'
);
$patgenMaxSeg = 4;
$patgen = array(
'begin'=>array(
'ch'=>'002',
'des'=>'0040',
'in'=>'001',
'sei'=>'0040'),
'end'=>array(),
'all'=>array(
'aa'=>'010',
'ab'=>'010',
'abl'=>'0210',
'ablo'=>'03400',
'aca'=>'0100',
'ace'=>'0100',
'ach'=>'0100',
'achr'=>'04000',
'aco'=>'0100',
'acr'=>'0100',
'acu'=>'0100',
'ad'=>'010',
'adm'=>'1000',
'adv'=>'1000',
'ae'=>'001',
'ael'=>'0100',
'aero'=>'00003',
'ag'=>'010',
'aged'=>'04300',
'agg'=>'1000',
'ah'=>'010',
'aic'=>'0100',
'ais'=>'0100',
'aiv'=>'0100',
'aj'=>'010',
'ak'=>'010',
'ala'=>'0100',
'ale'=>'0100',
'alei'=>'00300',
'alo'=>'0100',
'alu'=>'0100',
'am'=>'010',
'anim'=>'30000',
'ansp'=>'00400',
'ao'=>'010',
'ap'=>'010',
'aq'=>'010',
'ara'=>'0100',
'ari'=>'0100',
'aro'=>'0100',
'aru'=>'0100',
'ary'=>'0100',
'ash'=>'0120',
'asth'=>'30000',
'at'=>'010',
'atyr'=>'00004',
'av'=>'010',
'aw'=>'010',
'az'=>'010',
'ba'=>'100',
'bb'=>'210',
'bbo'=>'0300',
'bc'=>'010',
'bd'=>'210',
'be'=>'100',
'bh'=>'010',
'bi'=>'100',
'bisa'=>'00430',
'bj'=>'010',
'blu'=>'0100',
'bly'=>'0200',
'bm'=>'010',
'bn'=>'010',
'bo'=>'100',
'bp'=>'010',
'br'=>'120',
'bs'=>'212',
'bt'=>'010',
'bu'=>'100',
'bue'=>'0010',
'bui'=>'0010',
'bv'=>'010',
'cai'=>'0010',
'cc'=>'210',
'cd'=>'010',
'cenn'=>'43000',
'chr'=>'1000',
'chs'=>'2000',
'cht'=>'2000',
'chu'=>'1000',
'ci'=>'100',
'ck'=>'210',
'cl'=>'120',
'cm'=>'210',
'cocl'=>'00400',
'cop'=>'0032',
'cq'=>'010',
'cr'=>'020',
'cs'=>'012',
'ct'=>'210',
'ctro'=>'00003',
'cua'=>'0010',
'cue'=>'0010',
'cui'=>'0010',
'cy'=>'100',
'cyne'=>'00400',
'cyr'=>'0002',
'cz'=>'010',
'da'=>'100',
'dd'=>'210',
'de'=>'100',
'deru'=>'00400',
'dese'=>'00030',
'deso'=>'00430',
'desu'=>'00400',
'dg'=>'210',
'dhe'=>'0100',
'dias'=>'00034',
'dipt'=>'00340',
'disa'=>'00400',
'dise'=>'00430',
'disi'=>'00400',
'diso'=>'00400',
'disu'=>'00430',
'dj'=>'210',
'dm'=>'210',
'do'=>'100',
'dola'=>'00430',
'dosm'=>'43000',
'dr'=>'020',
'dros'=>'00034',
'dua'=>'0010',
'due'=>'0010',
'dui'=>'0010',
'dv'=>'210',
'dys'=>'0020',
'ea'=>'010',
'eau'=>'0200',
'eb'=>'010',
'eca'=>'0100',
'ece'=>'0100',
'eche'=>'03000',
'echi'=>'03000',
'eco'=>'0100',
'ecr'=>'0100',
'ecu'=>'0100',
'ed'=>'010',
'ee'=>'010',
'ef'=>'010',
'eff'=>'1000',
'eg'=>'010',
'eh'=>'010',
'ei'=>'010',
'ej'=>'010',
'ek'=>'010',
'ela'=>'0100',
'ele'=>'0100',
'elo'=>'0100',
'elod'=>'00300',
'elom'=>'04300',
'elu'=>'0100',
'em'=>'010',
'emag'=>'04300',
'enl'=>'2000',
'enop'=>'00034',
'eo'=>'010',
'eog'=>'0032',
'eop'=>'0032',
'eq'=>'010',
'era'=>'0100',
'eri'=>'0100',
'ero'=>'0100',
'erog'=>'40000',
'erop'=>'00034',
'eru'=>'0100',
'erur'=>'00300',
'ery'=>'0100',
'esem'=>'00400',
'est'=>'0200',
'esue'=>'00300',
'et'=>'010',
'eu'=>'001',
'euce'=>'00400',
'eun'=>'0100',
'ev'=>'010',
'ew'=>'010',
'fa'=>'100',
'ff'=>'210',
'fh'=>'210',
'fi'=>'100',
'fl'=>'120',
'fo'=>'100',
'fr'=>'120',
'ft'=>'010',
'fu'=>'100',
'ga'=>'100',
'gd'=>'210',
'ge'=>'100',
'gevi'=>'43000',
'gg'=>'210',
'gi'=>'100',
'gima'=>'43000',
'gl'=>'020',
'gm'=>'210',
'gn'=>'210',
'go'=>'100',
'gr'=>'120',
'gs'=>'212',
'gu'=>'101',
'gym'=>'0002',
'gymn'=>'00003',
'gyna'=>'00400',
'gyra'=>'00430',
'gz'=>'210',
'he'=>'020',
'hec'=>'0002',
'hect'=>'00003',
'heur'=>'00300',
'hloc'=>'03000',
'hm'=>'210',
'hn'=>'010',
'hog'=>'0032',
'hop'=>'0032',
'horh'=>'00300',
'hr'=>'020',
'hs'=>'010',
'ht'=>'010',
'ia'=>'010',
'iala'=>'00430',
'ib'=>'012',
'ic'=>'010',
'id'=>'010',
'ido'=>'0003',
'idop'=>'00004',
'ie'=>'010',
'if'=>'010',
'ig'=>'010',
'ih'=>'010',
'ii'=>'010',
'ik'=>'010',
'il'=>'010',
'im'=>'010',
'imad'=>'04000',
'imb'=>'1000',
'inf'=>'1000',
'inr'=>'1000',
'ins'=>'0002',
'inv'=>'1000',
'io'=>'011',
'iog'=>'0032',
'ios'=>'0002',
'iox'=>'0020',
'ip'=>'010',
'iq'=>'010',
'ira'=>'0100',
'iri'=>'0100',
'iro'=>'0100',
'irop'=>'00034',
'irur'=>'00300',
'isac'=>'00300',
'isas'=>'00300',
'isau'=>'00300',
'iseq'=>'00300',
'ises'=>'00300',
'isil'=>'00300',
'isin'=>'00300',
'isph'=>'03400',
'it'=>'010',
'iu'=>'010',
'iv'=>'010',
'iz'=>'010',
'kale'=>'00400',
'ke'=>'001',
'kra'=>'0001',
'lalg'=>'43000',
'larc'=>'43000',
'lb'=>'010',
'lc'=>'210',
'ld'=>'210',
'lech'=>'00300',
'leid'=>'00400',
'lf'=>'210',
'lg'=>'010',
'lh'=>'210',
'li'=>'100',
'lk'=>'210',
'll'=>'210',
'llur'=>'00300',
'lm'=>'210',
'lmod'=>'04300',
'ln'=>'010',
'lod'=>'2000',
'lodo'=>'03000',
'lopi'=>'40000',
'lp'=>'210',
'lq'=>'010',
'ls'=>'212',
'lt'=>'210',
'ltun'=>'04300',
'lue'=>'0010',
'lui'=>'0010',
'lur'=>'2000',
'lv'=>'210',
'ly'=>'100',
'lych'=>'00300',
'ma'=>'100',
'mb'=>'210',
'mc'=>'010',
'me'=>'100',
'mech'=>'00300',
'mese'=>'00430',
'mf'=>'010',
'mi'=>'100',
'mip'=>'0032',
'misi'=>'00040',
'mj'=>'010',
'ml'=>'010',
'mm'=>'210',
'mmen'=>'00043',
'mn'=>'210',
'mnam'=>'00300',
'mnas'=>'00300',
'mno'=>'0001',
'mnob'=>'00300',
'mnop'=>'00300',
'mo'=>'100',
'mony'=>'43000',
'mop'=>'0032',
'morr'=>'00300',
'mosp'=>'00040',
'most'=>'00340',
'mp'=>'210',
'mps'=>'0300',
'ms'=>'012',
'mu'=>'100',
'mv'=>'210',
'my'=>'100',
'myrr'=>'00400',
'na'=>'100',
'nae'=>'0100',
'nalg'=>'03000',
'nani'=>'03000',
'nap'=>'0120',
'nau'=>'0100',
'nb'=>'010',
'nc'=>'010',
'nd'=>'210',
'ne'=>'100',
'neq'=>'0100',
'nex'=>'0100',
'nf'=>'010',
'ng'=>'010',
'nh'=>'010',
'ni'=>'100',
'niq'=>'0100',
'nisp'=>'00300',
'nit'=>'0200',
'nj'=>'010',
'nl'=>'010',
'nm'=>'010',
'nn'=>'012',
'no'=>'100',
'nobl'=>'00040',
'nosp'=>'03340',
'nox'=>'0100',
'nq'=>'010',
'nr'=>'010',
'ns'=>'010',
'nsie'=>'04300',
'nsir'=>'04000',
'nsl'=>'0200',
'nst'=>'0020',
'nt'=>'010',
'ntah'=>'04300',
'ntap'=>'04300',
'nu'=>'100',
'nua'=>'0010',
'nue'=>'0010',
'nui'=>'0010',
'nv'=>'010',
'ny'=>'100',
'nz'=>'010',
'oa'=>'010',
'ob'=>'010',
'oblo'=>'00300',
'obs'=>'1000',
'oc'=>'010',
'ocle'=>'00300',
'od'=>'010',
'oe'=>'010',
'of'=>'010',
'og'=>'010',
'oh'=>'010',
'oi'=>'010',
'oj'=>'010',
'ol'=>'010',
'omna'=>'00400',
'ona'=>'0020',
'ono'=>'0001',
'onos'=>'00004',
'ons'=>'0002',
'oo'=>'010',
'op'=>'010',
'oq'=>'010',
'ora'=>'0100',
'ori'=>'0100',
'oro'=>'0100',
'orrh'=>'00400',
'oru'=>'0100',
'osl'=>'0120',
'ospo'=>'00400',
'ot'=>'010',
'otac'=>'04300',
'otos'=>'00034',
'ou'=>'001',
'oug'=>'0100',
'ov'=>'010',
'oy'=>'001',
'oz'=>'010',
'pa'=>'100',
'pans'=>'00030',
'pe'=>'100',
'ph'=>'100',
'pi'=>'100',
'pl'=>'120',
'pla'=>'0040',
'plop'=>'40300',
'pn'=>'010',
'pna'=>'0210',
'pne'=>'0200',
'po'=>'100',
'pp'=>'210',
'ppia'=>'04300',
'pr'=>'120',
'ps'=>'210',
'psod'=>'04300',
'psy'=>'3200',
'pt'=>'210',
'pu'=>'100',
'pub'=>'0012',
'pue'=>'2010',
'pui'=>'0010',
'pyl'=>'0001',
'pylo'=>'00400',
'qu'=>'002',
'quan'=>'00040',
'ralg'=>'43000',
'raq'=>'2000',
'rarc'=>'43000',
'rb'=>'010',
'rc'=>'010',
'rd'=>'210',
're'=>'100',
'rech'=>'00300',
'regi'=>'00003',
'renn'=>'43000',
'reut'=>'00300',
'rf'=>'010',
'rg'=>'210',
'rhi'=>'1000',
'rhu'=>'0100',
'rhyd'=>'03000',
'rj'=>'010',
'rl'=>'010',
'rm'=>'210',
'rn'=>'010',
'rp'=>'010',
'rq'=>'010',
'rr'=>'010',
'rraq'=>'00300',
'rs'=>'012',
'rt'=>'210',
'rua'=>'0010',
'rue'=>'0010',
'rui'=>'0010',
'rv'=>'010',
'rw'=>'010',
'ryse'=>'00400',
'rz'=>'010',
'sa'=>'100',
'sabu'=>'03000',
'sact'=>'43000',
'saf'=>'2100',
'sagr'=>'03000',
'sann'=>'03000',
'sap'=>'2100',
'saq'=>'2100',
'sarg'=>'03000',
'sarm'=>'03000',
'sart'=>'03000',
'sb'=>'210',
'sc'=>'120',
'scle'=>'00004',
'sd'=>'010',
'se'=>'100',
'sf'=>'210',
'sg'=>'210',
'sh'=>'010',
'si'=>'100',
'sige'=>'43000',
'siro'=>'03000',
'sj'=>'010',
'sk'=>'100',
'sl'=>'010',
'slav'=>'04000',
'sm'=>'210',
'sn'=>'010',
'so'=>'100',
'sob'=>'0002',
'sobe'=>'03000',
'sobl'=>'03000',
'socc'=>'03000',
'sodo'=>'03000',
'sord'=>'03000',
'sorg'=>'03000',
'soss'=>'03000',
'sox'=>'2100',
'sp'=>'010',
'spa'=>'2000',
'spai'=>'00040',
'spl'=>'2000',
'spo'=>'2000',
'sq'=>'010',
'sr'=>'010',
'ss'=>'212',
'ssa'=>'0300',
'st'=>'010',
'su'=>'100',
'sua'=>'0010',
'suba'=>'00400',
'subr'=>'00400',
'sue'=>'0010',
'sui'=>'0010',
'sun'=>'2100',
'sv'=>'210',
'sy'=>'100',
'talg'=>'43000',
'tamb'=>'43000',
'tart'=>'43000',
'td'=>'210',
'teco'=>'43000',
'tf'=>'210',
'tg'=>'210',
'thl'=>'0010',
'thm'=>'2000',
'tisp'=>'00340',
'tl'=>'010',
'tm'=>'210',
'tmo'=>'0001',
'tosp'=>'00340',
'toxy'=>'43000',
'tp'=>'210',
'tr'=>'020',
'tror'=>'40300',
'ts'=>'210',
'tt'=>'210',
'tua'=>'0010',
'tue'=>'0010',
'tui'=>'0010',
'tusa'=>'00430',
'ty'=>'001',
'tz'=>'210',
'uani'=>'03000',
'uas'=>'0100',
'uav'=>'0100',
'ubal'=>'00300',
'ubl'=>'0010',
'ubro'=>'00300',
'uca'=>'0100',
'uce'=>'0100',
'ucem'=>'00300',
'uch'=>'0100',
'uco'=>'0100',
'ucr'=>'0100',
'ucu'=>'0100',
'ud'=>'010',
'uel'=>'0100',
'uib'=>'0100',
'uic'=>'0100',
'ula'=>'0100',
'ule'=>'0100',
'ulo'=>'0100',
'uo'=>'011',
'ura'=>'0100',
'urgo'=>'30000',
'uri'=>'0100',
'uro'=>'0100',
'uru'=>'0100',
'ust'=>'0200',
'ut'=>'010',
'uu'=>'010',
'uv'=>'010',
'vai'=>'0010',
'viru'=>'00300',
'vn'=>'200',
'vr'=>'020',
'wn'=>'021',
'xa'=>'010',
'xc'=>'010',
'xe'=>'010',
'xh'=>'010',
'xi'=>'010',
'xo'=>'010',
'xp'=>'010',
'xq'=>'010',
'xs'=>'012',
'xt'=>'010',
'xu'=>'010',
'xua'=>'0010',
'xy'=>'011',
'xyl'=>'1000',
'ya'=>'010',
'yb'=>'010',
'yca'=>'0010',
'yce'=>'0100',
'ych'=>'0001',
'yco'=>'0100',
'ycta'=>'00430',
'ydr'=>'0001',
'ye'=>'010',
'yg'=>'010',
'yh'=>'010',
'yi'=>'010',
'yl'=>'200',
'ylac'=>'03000',
'ylam'=>'00300',
'yle'=>'0100',
'ylo'=>'0100',
'ynan'=>'00300',
'yneg'=>'00300',
'yo'=>'010',
'ypo'=>'0001',
'ypos'=>'00004',
'yr'=>'010',
'yro'=>'0001',
'yros'=>'00004',
'yse'=>'0010',
'yt'=>'010',
'yu'=>'010',
'yz'=>'010',
'ze'=>'001',
'zi'=>'100',
'zu'=>'101',
'zz'=>'210'
)
);
?>
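The `$patgenExceptions` array above maps a lowercase word to its fully hyphenated form. A plausible way to use such a table is to consult it before any pattern matching and to copy the hyphen positions onto the word as the author typed it. The following is a minimal sketch under that assumption, not the library's own lookup; `sketch_exception_lookup` is a hypothetical name.

```php
<?php
// Hypothetical helper (not part of PHP Typography): returns the hyphenated
// form of $word if it is listed in an exceptions table shaped like
// $patgenExceptions, preserving the word's original letter case.
// Returns null when the word is not listed.
function sketch_exception_lookup(string $word, array $exceptions): ?string
{
    // Keys in the table are lowercase; fall back to strtolower() when the
    // mbstring extension is unavailable (sufficient for ASCII words).
    $key = function_exists('mb_strtolower')
        ? mb_strtolower($word, 'UTF-8')
        : strtolower($word);

    if (!isset($exceptions[$key])) {
        return null;
    }

    $chars = preg_split('//u', $word, -1, PREG_SPLIT_NO_EMPTY);
    $template = preg_split('//u', $exceptions[$key], -1, PREG_SPLIT_NO_EMPTY);

    $out = '';
    $i = 0;
    foreach ($template as $c) {
        if ($c === '-') {
            $out .= '-';          // break point taken from the table
        } else {
            $out .= $chars[$i++]; // letter taken from the original word
        }
    }
    return $out;
}
```

A caller would typically try this lookup first and fall back to pattern-based hyphenation only when it returns `null`.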

333
php-typography/lang/id.php Normal file

@ -0,0 +1,333 @@
<?php
/*
Project: PHP Typography
Project URI: http://kingdesk.com/projects/php-typography/
File modified to place pattern and exceptions in arrays that can be understood in php files.
This file is released under the same copyright as the below referenced original file
Original unmodified file is available at: http://mirror.unl.edu/ctan/language/hyph-utf8/tex/generic/hyph-utf8/patterns/
Original file name: hyph-id.tex
//============================================================================================================
ORIGINAL FILE INFO
% This file is part of hyph-utf8 package and resulted from
% semi-manual conversions of hyphenation patterns into UTF-8 in June 2008.
%
% Source: inhyph.tex (1997-09-19)
% Author: Jörg Knappen <knappen@vkpmzd.kph.uni-mainz.de>, Terry Mart <mart@kph.uni-mainz.de>
%
% The above mentioned file should become obsolete,
% and the author of the original file should preferably modify this file instead.
%
% Modifications were needed in order to support native UTF-8 engines,
% but functionality (hopefully) didn't change in any way, at least not intentionally.
% This file is no longer stand-alone; at least for 8-bit engines
% you probably want to use loadhyph-foo.tex (which will load this file) instead.
%
% Modifications were done by Jonathan Kew, Mojca Miklavec & Arthur Reutenauer
% with help & support from:
% - Karl Berry, who gave us free hands and all resources
% - Taco Hoekwater, with useful macros
% - Hans Hagen, who did the unicodifisation of patterns already long before
% and helped with testing, suggestions and bug reports
% - Norbert Preining, who tested & integrated patterns into TeX Live
%
% However, the 'copyright/copyleft' owner of patterns remains the original author.
%
% The copyright statement of this file is thus:
%
% Do with this file whatever needs to be done in future for the sake of
% 'a better world' as long as you respect the copyright of original file.
% If you're the original author of patterns or taking over a new revolution,
% please remove all of the TUG comments & credits that we added here -
% you are the Queen / the King, we are only the servants.
%
% If you want to change this file, rather than uploading directly to CTAN,
% we would be grateful if you could send it to us (http://tug.org/tex-hyphen)
% or ask for credentials for SVN repository and commit it yourself;
% we will then upload the whole 'package' to CTAN.
%
% Before a new 'pattern-revolution' starts,
% please try to follow some guidelines if possible:
%
% - \lccode is *forbidden*, and I really mean it
% - all the patterns should be in UTF-8
% - the only 'allowed' TeX commands in this file are: \patterns, \hyphenation,
% and if you really cannot do without, also \input and \message
% - in particular, please no \catcode or \lccode changes,
% they belong to loadhyph-foo.tex,
% and no \lefthyphenmin and \righthyphenmin,
% they have no influence here and belong elsewhere
% - \begingroup and/or \endinput is not needed
% - feel free to do whatever you want inside comments
%
% We know that TeX is extremely powerful, but give a stupid parser
% at least a chance to read your patterns.
%
% For more information see
%
% http://tug.org/tex-hyphen
%
%------------------------------------------------------------------------------
%
% inhyph.tex
% Version 1.3 19-SEP-1997
%
% Hyphenation patterns for bahasa indonesia (probably also usable
% for bahasa melayu)
%
% (c) Copyright 1996, 1997 Jörg Knappen and Terry Mart
%
% These patterns are free software according to the GNU General Public
% licence version 2, June 1991.
%
% Please read the GNU licence for details. If you don't receive a GNU
% licence with these patterns, you can obtain it from
%
% Free Software Foundation, Inc.
% 675 Mass Ave, Cambridge, MA 02139, USA
%
% If you make any changes to this file, please rename it so that it
% cannot be confused with the original one, and change the contact
% address for bug reports and suggestions.
%
% For bug reports, improvements, and suggestions, contact
%
% Jörg Knappen
% jk Unternehmensberatung
% Barbarossaring 43
% 55118 Mainz
%
% knappen@vkpmzd.kph.uni-mainz.de
%
% or:
% Terry Mart
%
% Institut fuer Kernphysik
% Universitaet Mainz
% 55099 Mainz
% Germany
%
% phone : +49 6131 395174
% fax : +49 6131 395474
% email : mart@kph.uni-mainz.de
%
%*%*%*%*%*%*%*%*%*%*%*%*%*%*%*%*%*%*%*%*%*%*%*%*%*%*%*%*%*%*%*%*%*%*%*%*%*%*%*%*
%
% The patterns are best used with the following parameters
%
% \lefthyphenmin=2 \righthyphenmin=2 %
%
%*%*%*%*%*%*%*%*%*%*%*%*%*%*%*%*%*%*%*%*%*%*%*%*%*%*%*%*%*%*%*%*%*%*%*%*%*%*%*%*
//============================================================================================================
*/
$patgenLanguage = 'Indonesian';
$patgenExceptions = array(
'berabe'=>'be-ra-be',
'berahi'=>'be-ra-hi',
'berak'=>'be-rak',
'beranda'=>'be-ran-da',
'berandal'=>'be-ran-dal',
'berang'=>'be-rang',
'berangasan'=>'be-ra-ngas-an',
'berangsang'=>'be-rang-sang',
'berangus'=>'be-ra-ngus',
'berani'=>'be-ra-ni',
'berantakan'=>'be-ran-tak-an',
'berantam'=>'be-ran-tam',
'berantas'=>'be-ran-tas',
'berapa'=>'be-ra-pa',
'beras'=>'be-ras',
'berendeng'=>'be-ren-deng',
'berengut'=>'be-re-ngut',
'bererot'=>'be-re-rot',
'beres'=>'be-res',
'berewok'=>'be-re-wok',
'beri'=>'be-ri',
'beringas'=>'be-ri-ngas',
'berisik'=>'be-ri-sik',
'berita'=>'be-ri-ta',
'berok'=>'be-rok',
'berondong'=>'be-ron-dong',
'berontak'=>'be-ron-tak',
'berudu'=>'be-ru-du',
'beruk'=>'be-ruk',
'beruntun'=>'be-run-tun',
'pengekspor'=>'peng-eks-por',
'pengimpor'=>'peng-im-por',
'tera'=>'te-ra',
'terang'=>'te-rang',
'teras'=>'te-ras',
'terasi'=>'te-ra-si',
'teratai'=>'te-ra-tai',
'terawang'=>'te-ra-wang',
'teraweh'=>'te-ra-weh',
'teriak'=>'te-ri-ak',
'terigu'=>'te-ri-gu',
'terik'=>'te-rik',
'terima'=>'te-ri-ma',
'teripang'=>'te-ri-pang',
'terobos'=>'te-ro-bos',
'terobosan'=>'te-ro-bos-an',
'teromol'=>'te-ro-mol',
'terompah'=>'te-rom-pah',
'terompet'=>'te-rom-pet',
'teropong'=>'te-ro-pong',
'terowongan'=>'te-ro-wong-an',
'terubuk'=>'te-ru-buk',
'teruna'=>'te-ru-na',
'terus'=>'te-rus',
'terusi'=>'te-ru-si'
);
$patgenMaxSeg = 6;
$patgen = array(
'begin'=>array(
'ber'=>'0023',
'ter'=>'0023',
'meng'=>'00203',
'per'=>'0023',
'atau'=>'02020',
'tangan'=>'0030400',
'lengan'=>'0030400',
'jangan'=>'0030400',
'mangan'=>'0030400',
'pangan'=>'0030400',
'ringan'=>'0030400',
'dengan'=>'0030400'
),
'end'=>array(
'ng'=>'200',
'ny'=>'200',
'ban'=>'2100',
'can'=>'2100',
'dan'=>'2100',
'fan'=>'2100',
'gan'=>'2100',
'han'=>'2100',
'jan'=>'2100',
'kan'=>'2100',
'lan'=>'2100',
'man'=>'2100',
'ngan'=>'20100',
'nan'=>'2100',
'pan'=>'2100',
'ran'=>'2100',
'san'=>'2100',
'tan'=>'2100',
'van'=>'2100',
'zan'=>'2100',
'an'=>'300'
),
'all'=>array(
'ck'=>'210',
'cn'=>'210',
'dk'=>'210',
'dn'=>'210',
'dp'=>'210',
'fd'=>'210',
'fk'=>'210',
'fn'=>'210',
'ft'=>'210',
'gg'=>'210',
'gk'=>'210',
'gn'=>'210',
'hk'=>'210',
'hl'=>'210',
'hm'=>'210',
'hn'=>'210',
'hw'=>'210',
'jk'=>'210',
'jn'=>'210',
'kb'=>'210',
'kk'=>'210',
'km'=>'210',
'kn'=>'210',
'kr'=>'210',
'ks'=>'210',
'kt'=>'210',
'lb'=>'210',
'lf'=>'210',
'lg'=>'210',
'lh'=>'210',
'lk'=>'210',
'lm'=>'210',
'ln'=>'210',
'ls'=>'210',
'lt'=>'210',
'lq'=>'210',
'mb'=>'210',
'mk'=>'210',
'ml'=>'210',
'mm'=>'210',
'mn'=>'210',
'mp'=>'210',
'mr'=>'210',
'ms'=>'210',
'nc'=>'210',
'nd'=>'210',
'nf'=>'210',
'nj'=>'210',
'nk'=>'210',
'nn'=>'210',
'np'=>'210',
'ns'=>'210',
'nt'=>'210',
'nv'=>'210',
'pk'=>'210',
'pn'=>'210',
'pp'=>'210',
'pr'=>'210',
'pt'=>'210',
'rb'=>'210',
'rc'=>'210',
'rf'=>'210',
'rg'=>'210',
'rh'=>'210',
'rj'=>'210',
'rk'=>'210',
'rl'=>'210',
'rm'=>'210',
'rn'=>'210',
'rp'=>'210',
'rr'=>'210',
'rs'=>'210',
'rt'=>'210',
'rw'=>'210',
'ry'=>'210',
'sb'=>'210',
'sk'=>'210',
'sl'=>'210',
'sm'=>'210',
'sn'=>'210',
'sp'=>'210',
'sr'=>'210',
'ss'=>'210',
'st'=>'210',
'sw'=>'210',
'tk'=>'210',
'tl'=>'210',
'tn'=>'210',
'tt'=>'210',
'wt'=>'210',
'ngg'=>'2010',
'ngh'=>'2010',
'ngk'=>'2010',
'ngn'=>'2010',
'ngs'=>'2010',
'nst'=>'2320',
'ion'=>'0210',
'air'=>'0200',
'bagai'=>'101020'
)
);
?>

4299
php-typography/lang/is.php Normal file

File diff suppressed because it is too large

518
php-typography/lang/it.php Normal file
View File

@ -0,0 +1,518 @@
<?php
/*
Project: PHP Typography
Project URI: http://kingdesk.com/projects/php-typography/
File modified to place pattern and exceptions in arrays that can be understood in php files.
This file is released under the same copyright as the below referenced original file
Original unmodified file is available at: http://mirror.unl.edu/ctan/language/hyph-utf8/tex/generic/hyph-utf8/patterns/
Original file name: hyph-it.tex
//============================================================================================================
ORIGINAL FILE INFO
% This file is part of hyph-utf8 package and resulted from
% semi-manual conversions of hyphenation patterns into UTF-8 in June 2008.
%
% Source: ithyph.tex (2008-03-08)
% Author: Claudio Beccari <claudio.beccari at polito.it>
%
% The above mentioned file should become obsolete,
% and the author of the original file should preferably modify this file instead.
%
% Modifications were needed in order to support native UTF-8 engines,
% but functionality (hopefully) didn't change in any way, at least not intentionally.
% This file is no longer stand-alone; at least for 8-bit engines
% you probably want to use loadhyph-foo.tex (which will load this file) instead.
%
% Modifications were done by Jonathan Kew, Mojca Miklavec & Arthur Reutenauer
% with help & support from:
% - Karl Berry, who gave us free hands and all resources
% - Taco Hoekwater, with useful macros
% - Hans Hagen, who did the unicodifisation of patterns already long before
% and helped with testing, suggestions and bug reports
% - Norbert Preining, who tested & integrated patterns into TeX Live
%
% However, the 'copyright/copyleft' owner of patterns remains the original author.
%
% The copyright statement of this file is thus:
%
% Do with this file whatever needs to be done in future for the sake of
% 'a better world' as long as you respect the copyright of original file.
% If you're the original author of patterns or taking over a new revolution,
% please remove all of the TUG comments & credits that we added here -
% you are the Queen / the King, we are only the servants.
%
% If you want to change this file, rather than uploading directly to CTAN,
% we would be grateful if you could send it to us (http://tug.org/tex-hyphen)
% or ask for credentials for SVN repository and commit it yourself;
% we will then upload the whole 'package' to CTAN.
%
% Before a new 'pattern-revolution' starts,
% please try to follow some guidelines if possible:
%
% - \lccode is *forbidden*, and I really mean it
% - all the patterns should be in UTF-8
% - the only 'allowed' TeX commands in this file are: \patterns, \hyphenation,
% and if you really cannot do without, also \input and \message
% - in particular, please no \catcode or \lccode changes,
% they belong to loadhyph-foo.tex,
% and no \lefthyphenmin and \righthyphenmin,
% they have no influence here and belong elsewhere
% - \begingroup and/or \endinput is not needed
% - feel free to do whatever you want inside comments
%
% We know that TeX is extremely powerful, but give a stupid parser
% at least a chance to read your patterns.
%
% For more information see
%
% http://tug.org/tex-hyphen
%
%------------------------------------------------------------------------------
%
%%%%%%%%%%%%%%%%%%%%%%%%%%% file ithyph.tex %%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%
% Prepared by Claudio Beccari e-mail claudio.beccari@polito.it
%
% Dipartimento di Elettronica
% Politecnico di Torino
% Corso Duca degli Abruzzi, 24
% 10129 TORINO
%
% Copyright 1998, 2008 Claudio Beccari
%
% This program is free software; it can be redistributed and/or modified
% under the terms of the GNU Lesser General Public Licence,
% as published by the Free Software Foundation, either version 2.1 of the
% Licence or (at your option) any later version.
%
% \versionnumber{4.8g} \versiondate{2008/03/08}
%
% These hyphenation patterns for the Italian language are supposed to comply
% with the Recommendation UNI 6461 on hyphenation issued by the Italian
% Standards Institution (Ente Nazionale di Unificazione UNI). No guarantee
% or declaration of fitness to any particular purpose is given and any
% liability is disclaimed.
%
% See comments at the end of the file after the \endinput line
% %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% Information %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%
% As the previous versions, this new set of patterns does not contain any
% accented character so that the hyphenation algorithm behaves properly in
% both cases, that is with OT1 and T1 encodings. With the former encoding
% fonts do not contain accented characters, while with the latter accented
% characters are present and sequences such as à map directly to slot 'E0 that
% contains 'agrave'.
%
% Of course if you use T1 encoded fonts you get the full power of the hyphen-
% ation algorithm, while if you use OT1 encoded fonts you miss some possible
% break points; this is not a big inconvenience in Italian because:
%
% 1) The Regulation UNI 6015 on accents specifies that compulsory accents
% appear only on the ending vowel of oxytone words (parole tronche); this
% means that it is almost indifferent to have or to miss the T1 encoded
% fonts because the only difference consists in how TeX evaluates the end
% of the word; in practice if you have these special facilities you get
% 'qua-li-tà', while if you miss them, you get 'qua-lità' (assuming that
% \righthyphenmin > 1).
%
% 2) Optional accents are so rare in Italian, that if you absolutely want to
% use them in those rare instances, and you miss the T1 encoding
% facilities, you should also provide explicit discretionary hyphens as in
% 'sé\-gui\-to'.
%
% There is no explicit hyphenation exception list because these patterns
% proved to hyphenate correctly a very large set of words suitably chosen in
% order to test them in the most heavy circumstances; these patterns were used
% in the preparation of a number of books and no errors were discovered.
%
% Nevertheless if you frequently use technical terms that you want hyphenated
% differently from what is normally done (for example if you prefer
% etymological hyphenation of prefixed and/or suffixed words) you should
% insert a specific hyphenation list in the preamble of your document, for
% example:
%
% \hyphenation{su-per-in-dut-to-re su-per-in-dut-to-ri}
%
% If you use, as you should, the italian option of the babel package, then you
% have available the active character ' that allows you to put a discretionary
% break at a word boundary of a compound word while maintaining the hyphenation
% algorithm on the rest of the word.
%
% Please, read the babel package documentation.
%
% Should you find any word that gets hyphenated in a wrong way, please, AFTER
% CHECKING ON A RELIABLE MODERN DICTIONARY, report to the author, preferably
% by e-mail.
%
%
% Happy multilingual typesetting!
//============================================================================================================
*/
$patgenLanguage = 'Italian';
$patgenExceptions = array();
$patgenMaxSeg = 7;
$patgen = array(
'begin'=>array(
'apn'=>'0320',
'anti'=>'00001',
'antimn'=>'0000320',
'bio'=>'0001',
'caps'=>'00430',
'circum'=>'0000021',
'contro'=>'0000001',
'discine'=>'00230000',
'exeu'=>'02100',
'frank'=>'000023',
'free'=>'00003',
'lipsa'=>'003200',
'narco'=>'000001',
'opto'=>'00001',
'ortop'=>'000032',
'para'=>'00001',
'polip'=>'000032',
'pre'=>'0001',
'ps'=>'020',
'reiscr'=>'0012000',
'share'=>'000203',
'transc'=>'0000230',
'transd'=>'0000230',
'transl'=>'0000230',
'transn'=>'0000230',
'transp'=>'0000230',
'transr'=>'0000230',
'transt'=>'0000230',
'sublu'=>'002300',
'subr'=>'00230',
'wagn'=>'00230',
'welt'=>'00021',
'c'=>'02',
'd'=>'02',
'z'=>'02'
),
'end'=>
array(
'at'=>'200',
'b'=>'20',
'c'=>'20',
'd'=>'20',
'f'=>'20',
'g'=>'20',
'h'=>'20',
'j'=>'20',
'k'=>'20',
'l'=>'20',
'l\''=>'200',
'm'=>'20',
'n'=>'20',
'p'=>'20',
'q'=>'20',
'r'=>'20',
'sh'=>'200',
's'=>'40',
's\''=>'400',
't'=>'20',
't\''=>'200',
'v'=>'20',
'v\''=>'200',
'w'=>'20',
'x'=>'20',
'z'=>'20',
'z\''=>'200'
),
'all'=>array(
'\''=>'22',
'aia'=>'0100',
'aie'=>'0100',
'aio'=>'0100',
'aiu'=>'0100',
'auo'=>'0100',
'aya'=>'0100',
'eiu'=>'0100',
'ew'=>'020',
'oia'=>'0100',
'oie'=>'0100',
'oio'=>'0100',
'oiu'=>'0100',
'b'=>'10',
'bb'=>'200',
'bc'=>'200',
'bd'=>'200',
'bf'=>'200',
'bm'=>'200',
'bn'=>'200',
'bp'=>'200',
'bs'=>'200',
'bt'=>'200',
'bv'=>'200',
'bl'=>'020',
'br'=>'020',
'b\''=>'200',
'c'=>'10',
'cb'=>'200',
'cc'=>'200',
'cd'=>'200',
'cf'=>'200',
'ck'=>'200',
'cm'=>'200',
'cn'=>'200',
'cq'=>'200',
'cs'=>'200',
'ct'=>'200',
'cz'=>'200',
'chh'=>'2000',
'ch'=>'020',
'chb'=>'2000',
'chr'=>'0020',
'chn'=>'2000',
'cl'=>'020',
'cr'=>'020',
'c\''=>'200',
'd'=>'10',
'db'=>'200',
'dd'=>'200',
'dg'=>'200',
'dl'=>'200',
'dm'=>'200',
'dn'=>'200',
'dp'=>'200',
'dr'=>'020',
'ds'=>'200',
'dt'=>'200',
'dv'=>'200',
'dw'=>'200',
'd\''=>'200',
'f'=>'10',
'fb'=>'200',
'fg'=>'200',
'ff'=>'200',
'fn'=>'200',
'fl'=>'020',
'fr'=>'020',
'fs'=>'200',
'ft'=>'200',
'f\''=>'200',
'g'=>'10',
'gb'=>'200',
'gd'=>'200',
'gf'=>'200',
'gg'=>'200',
'gh'=>'020',
'gl'=>'020',
'gm'=>'200',
'gn'=>'020',
'gp'=>'200',
'gr'=>'020',
'gs'=>'200',
'gt'=>'200',
'gv'=>'200',
'gw'=>'200',
'gz'=>'200',
'ght'=>'2020',
'g\''=>'200',
'h'=>'10',
'hb'=>'200',
'hd'=>'200',
'hh'=>'200',
'hipn'=>'00320',
'hl'=>'020',
'hm'=>'200',
'hn'=>'200',
'hr'=>'200',
'hv'=>'200',
'h\''=>'200',
'j'=>'10',
'j\''=>'200',
'k'=>'10',
'kg'=>'200',
'kf'=>'200',
'kh'=>'020',
'kk'=>'200',
'kl'=>'020',
'km'=>'200',
'kr'=>'020',
'ks'=>'200',
'kt'=>'200',
'k\''=>'200',
'l'=>'10',
'lb'=>'200',
'lc'=>'200',
'ld'=>'200',
'lf'=>'232',
'lg'=>'200',
'lh'=>'020',
'lk'=>'200',
'll'=>'200',
'lm'=>'200',
'ln'=>'200',
'lp'=>'200',
'lq'=>'200',
'lr'=>'200',
'ls'=>'200',
'lt'=>'200',
'lv'=>'200',
'lw'=>'200',
'lz'=>'200',
'l\'\''=>'2000',
'm'=>'10',
'mb'=>'200',
'mc'=>'200',
'mf'=>'200',
'ml'=>'200',
'mm'=>'200',
'mn'=>'200',
'mp'=>'200',
'mq'=>'200',
'mr'=>'200',
'ms'=>'200',
'mt'=>'200',
'mv'=>'200',
'mw'=>'200',
'm\''=>'200',
'n'=>'10',
'nb'=>'200',
'nc'=>'200',
'nd'=>'200',
'nf'=>'200',
'ng'=>'200',
'nk'=>'200',
'nl'=>'200',
'nm'=>'200',
'nn'=>'200',
'np'=>'200',
'nq'=>'200',
'nr'=>'200',
'ns'=>'200',
'nsfer'=>'023000',
'nt'=>'200',
'nv'=>'200',
'nz'=>'200',
'ngn'=>'0230',
'nheit'=>'200000',
'n\''=>'200',
'p'=>'10',
'pd'=>'200',
'ph'=>'020',
'pl'=>'020',
'pn'=>'200',
'pne'=>'3200',
'pp'=>'200',
'pr'=>'020',
'ps'=>'200',
'psic'=>'32000',
'pt'=>'200',
'pz'=>'200',
'p\''=>'200',
'q'=>'10',
'qq'=>'200',
'q\''=>'200',
'r'=>'10',
'rb'=>'200',
'rc'=>'200',
'rd'=>'200',
'rf'=>'200',
'rh'=>'020',
'rg'=>'200',
'rk'=>'200',
'rl'=>'200',
'rm'=>'200',
'rn'=>'200',
'rp'=>'200',
'rq'=>'200',
'rr'=>'200',
'rs'=>'200',
'rt'=>'200',
'rts'=>'0223',
'rv'=>'200',
'rx'=>'200',
'rw'=>'200',
'rz'=>'200',
'r\''=>'200',
's'=>'12',
'shm'=>'2000',
'sh\''=>'2000',
'ss'=>'230',
'ssm'=>'0430',
'spn'=>'2320',
'stb'=>'2000',
'stc'=>'2000',
'std'=>'2000',
'stf'=>'2000',
'stg'=>'2000',
'stm'=>'2000',
'stn'=>'2000',
'stp'=>'2000',
'sts'=>'2000',
'stt'=>'2000',
'stv'=>'2000',
'sz'=>'200',
's\'\''=>'4000',
't'=>'10',
'tb'=>'200',
'tc'=>'200',
'td'=>'200',
'tf'=>'200',
'tg'=>'200',
'th'=>'020',
'tl'=>'020',
'tm'=>'200',
'tn'=>'200',
'tp'=>'200',
'tr'=>'020',
'ts'=>'020',
'tsch'=>'32000',
'tt'=>'200',
'tts'=>'0230',
'tv'=>'200',
'tw'=>'200',
'tz'=>'020',
'tzk'=>'2000',
'tzs'=>'0020',
't\'\''=>'2000',
'v'=>'10',
'vc'=>'200',
'vl'=>'020',
'vr'=>'020',
'vv'=>'200',
'v\'\''=>'2000',
'w'=>'10',
'wh'=>'020',
'war'=>'0020',
'wy'=>'210',
'w\''=>'200',
'x'=>'10',
'xb'=>'200',
'xc'=>'200',
'xf'=>'200',
'xh'=>'200',
'xm'=>'200',
'xp'=>'200',
'xt'=>'200',
'xw'=>'200',
'x\''=>'200',
'you'=>'0100',
'yi'=>'010',
'z'=>'10',
'zb'=>'200',
'zd'=>'200',
'zl'=>'200',
'zn'=>'200',
'zp'=>'200',
'zt'=>'200',
'zs'=>'200',
'zv'=>'200',
'zz'=>'200',
'z\'\''=>'2000'
)
);
?>
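
The pattern tables in these language files can be read as follows: each key is a letter sequence, and each value is a digit string one character longer than the key, giving a weight for every gap before, between, and after its letters. `begin` entries apply only at the start of a word, `end` entries only at the end, and `all` entries anywhere. The sketch below shows the usual Liang-style evaluation (take the highest weight per gap; an odd weight allows a break). It is a minimal illustration under those assumptions: `patgen_demo_hyphenate` and `$demoPatterns` are hypothetical names, not part of PHP Typography, and the library itself presumably also consults `$patgenExceptions` and minimum-length settings, which are omitted here.

```php
<?php
// Hypothetical helper for illustration only; it is not part of PHP Typography.
function patgen_demo_hyphenate($word, array $patgen, $maxSeg, $hyphen = '-')
{
    // Split into UTF-8 characters so Cyrillic or accented words work too.
    $chars = preg_split('//u', $word, -1, PREG_SPLIT_NO_EMPTY);
    $n = count($chars);
    // One weight per gap: before the first letter, between letters, after the last.
    $weights = array_fill(0, $n + 1, 0);

    for ($start = 0; $start < $n; $start++) {
        for ($len = 1; $len <= $maxSeg && $start + $len <= $n; $len++) {
            $segment = implode('', array_slice($chars, $start, $len));
            $hits = array();
            if ($start === 0 && isset($patgen['begin'][$segment])) {
                $hits[] = $patgen['begin'][$segment];
            }
            if ($start + $len === $n && isset($patgen['end'][$segment])) {
                $hits[] = $patgen['end'][$segment];
            }
            if (isset($patgen['all'][$segment])) {
                $hits[] = $patgen['all'][$segment];
            }
            foreach ($hits as $digits) {
                // Keep the highest weight seen for each gap the pattern covers.
                $limit = min(strlen($digits), $len + 1);
                for ($i = 0; $i < $limit; $i++) {
                    $weights[$start + $i] = max($weights[$start + $i], (int) $digits[$i]);
                }
            }
        }
    }

    $out = '';
    foreach ($chars as $pos => $char) {
        // An odd weight marks an allowed break; never break before the first letter.
        if ($pos > 0 && $weights[$pos] % 2 === 1) {
            $out .= $hyphen;
        }
        $out .= $char;
    }
    return $out;
}

// Small excerpt copied from the Italian tables (assumed sufficient for this word).
$demoPatterns = array(
    'begin' => array('pre' => '0001'),
    'end'   => array('b' => '20'),
    'all'   => array('b' => '10', 'bb' => '200'),
);
echo patgen_demo_hyphenate('babbo', $demoPatterns, 7), "\n"; // bab-bo
```

With a full language file loaded, the same call would take that file's `$patgen` and `$patgenMaxSeg` in place of the excerpt. The `'bb'=>'200'` entry suppresses the break before the double consonant, while the second `'b'=>'10'` supplies the odd weight that splits it.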

545
php-typography/lang/la.php Normal file
View File

@ -0,0 +1,545 @@
<?php
/*
Project: PHP Typography
Project URI: http://kingdesk.com/projects/php-typography/
File modified to place pattern and exceptions in arrays that can be understood in php files.
This file is released under the same copyright as the below referenced original file
Original unmodified file is available at: http://mirror.unl.edu/ctan/language/hyph-utf8/tex/generic/hyph-utf8/patterns/
Original file name: hyph-la.tex
//============================================================================================================
ORIGINAL FILE INFO
% This file is part of hyph-utf8 package and resulted from
% semi-manual conversions of hyphenation patterns into UTF-8 in June 2008.
%
% Source: lahyph.tex (2007-09-03)
% Author: Claudio Beccari <claudio.beccari at polito.it>
%
% The above mentioned file should become obsolete,
% and the author of the original file should preferably modify this file instead.
%
% Modifications were needed in order to support native UTF-8 engines,
% but functionality (hopefully) didn't change in any way, at least not intentionally.
% This file is no longer stand-alone; at least for 8-bit engines
% you probably want to use loadhyph-foo.tex (which will load this file) instead.
%
% Modifications were done by Jonathan Kew, Mojca Miklavec & Arthur Reutenauer
% with help & support from:
% - Karl Berry, who gave us free hands and all resources
% - Taco Hoekwater, with useful macros
% - Hans Hagen, who did the unicodifisation of patterns already long before
% and helped with testing, suggestions and bug reports
% - Norbert Preining, who tested & integrated patterns into TeX Live
%
% However, the 'copyright/copyleft' owner of patterns remains the original author.
%
% The copyright statement of this file is thus:
%
% Do with this file whatever needs to be done in future for the sake of
% 'a better world' as long as you respect the copyright of original file.
% If you're the original author of patterns or taking over a new revolution,
% please remove all of the TUG comments & credits that we added here -
% you are the Queen / the King, we are only the servants.
%
% If you want to change this file, rather than uploading directly to CTAN,
% we would be grateful if you could send it to us (http://tug.org/tex-hyphen)
% or ask for credentials for SVN repository and commit it yourself;
% we will then upload the whole 'package' to CTAN.
%
% Before a new 'pattern-revolution' starts,
% please try to follow some guidelines if possible:
%
% - \lccode is *forbidden*, and I really mean it
% - all the patterns should be in UTF-8
% - the only 'allowed' TeX commands in this file are: \patterns, \hyphenation,
% and if you really cannot do without, also \input and \message
% - in particular, please no \catcode or \lccode changes,
% they belong to loadhyph-foo.tex,
% and no \lefthyphenmin and \righthyphenmin,
% they have no influence here and belong elsewhere
% - \begingroup and/or \endinput is not needed
% - feel free to do whatever you want inside comments
%
% We know that TeX is extremely powerful, but give a stupid parser
% at least a chance to read your patterns.
%
% For more information see
%
% http://tug.org/tex-hyphen
%
%------------------------------------------------------------------------------
%
% ********** lahyph.tex *************
%
% Copyright 1999- 2001 Claudio Beccari
% [latin hyphenation patterns]
%
% -----------------------------------------------------------------
% IMPORTANT NOTICE:
%
% This program can be redistributed and/or modified under the terms
% of the LaTeX Project Public License Distributed from CTAN
% archives in directory macros/latex/base/lppl.txt; either
% version 1 of the License, or any later version.
% -----------------------------------------------------------------
%
% Patterns for the latin language mainly in modern spelling
% (u when u is needed and v when v is needed); medieval spelling
% with the ligatures \ae and \oe and the (uncial) lowercase `v'
% written as a `u' is also supported; apparently there is no conflict
% between the patterns of modern Latin and those of medieval Latin.
%
% Support for font encoding OT1 with 128-character set and
% for font encoding T1 with a 256-character set.
%
% Prepared by Claudio Beccari
% Politecnico di Torino
% Torino, Italy
% e-mail beccari@polito.it
%
% 1999/03/10 Integration of `lahyph7.tex' and `lahyph8.tex' into
% one file `lahyph.tex' supporting fonts in OT1 and T1 encoding by
% Bernd Raichle using the macro code from `dehypht.tex' (this code
% is Copyright 1993,1994,1998,1999 Bernd Raichle/DANTE e.V.).
%
%
% \versionnumber{3.1} \versiondate{2007/04/16}
%
% Information after \endinput.
%
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%
% \message{Latin Hyphenation Patterns `lahyph' Version 3.1 <2007/04/16>}
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%
% For documentation see:
% C. Beccari, 'Computer aided hyphenation for Italian and Modern
% Latin', TUG vol. 13, n. 1, pp. 23-33 (1992)
%
% see also
%
% C. Beccari, 'Typesetting of ancient languages',
% TUG vol.15, n.1, pp. 9-16 (1994)
%
% In the former paper the code was described as being contained in file
% ITALAT.TEX; this is substantially the same code, but the file has been
% renamed LAHYPH.TEX in accordance with the ISO name for Latin and the
% convention that all hyphenation pattern file names should be formed by the
% agglutination of two letter language ISO code and the abbreviation HYPH.
%
% A corresponding file (ITHYPH.TEX) has been extracted in order to eliminate
% the (few) patterns specific to Latin and leave those specific to Italian;
% ITHYPH.TEX has been further extended with many new patterns in order to
% cope with the many neologisms and technical terms with foreign roots.
%
% Should you find any word that gets hyphenated in a wrong way, please, AFTER
% CHECKING ON A RELIABLE MODERN DICTIONARY, report to the author, preferably
% by e-mail. Please do not report about wrong break points concerning
% prefixes and/or suffixes; see at the bottom of this file.
%
% Compared with the previous versions, this file has been extended so as to
% cope also with the medieval Latin spelling, where the letter `V' played the
% roles of both `U' and `V', as in the Roman times, save that the Romans used
% only capitals. In the middle ages, with the availability of soft writing supports
% and the necessity of copying books with a reasonable speed, several scripts
% evolved in (practically) all of which there was a lower case alphabet
% different from the upper case one, and where the lower case `v' had the
% rounded shape of our modern lower case `u', and where the Latin diphthongs
% `AE' and `OE', both in upper and lower case, were written as ligatures,
% not to mention the habit of substituting them with their sound, that is a
% simple `E'.
%
% According to Leon Battista Alberti, who in 1466 wrote a book on
% cryptography where he thoroughly analyzed the hyphenation of the Latin
% language of his (still medieval) times, the differences from the Tuscan
% language (the Italian language, as it was named at his time) were very
% limited, in particular for what concerns the handling of the ascending and
% descending diphthongs; in Central and Northern Europe, and later on in
% North America, the Scholars perceived the above diphthongs as made of two
% distinct vowels; the hyphenation of medieval Latin, therefore, was quite
% different in the northern countries compared to the southern ones, at least
% for what concerns these diphthongs. If you need hyphenation patterns for
% medieval Latin that suit you better according to the habits of Northern
% Europe you should resort to the hyphenation patterns prepared by Yannis
% Haralambous (TUGboat, vol.13 n.4 (1992)).
%
%
%
% PREFIXES AND SUFFIXES
%
% For what concerns prefixes and suffixes, the latter are generally separated
% according to 'natural' syllabification, while the former are generally
% divided etymologically. In order to avoid an excessive number of patterns,
% care has been paid to some prefixes, especially 'ex', 'trans', 'circum',
% 'prae', but this set of patterns is NOT capable of separating the prefixes
% in all circumstances.
%
% BABEL SHORTCUTS AND FACILITIES
%
% Read the documentation coming with the description of the Latin language
% interface of Babel in order to see the shortcuts and the facilities
% introduced in order to facilitate the insertion of 'compound word marks'
% which are very useful for inserting etymological break points.
%
% Happy Latin and multilingual typesetting!
//============================================================================================================
*/
$patgenLanguage = 'Latin';
$patgenExceptions = array();
$patgenMaxSeg = 7;
$patgen = array(
'begin'=>array(
'abl'=>'0230',
'anti'=>'00001',
'antimn'=>'0000320',
'circum'=>'0000021',
'coniun'=>'0021000',
'discine'=>'00230000',
'ex'=>'021',
'ob'=>'023',
'parai'=>'000010',
'parau'=>'000010',
'sublu'=>'002300',
'subr'=>'00230'
),
'end'=>array(
'sque'=>'23000',
'sdem'=>'23000',
'b'=>'20',
'c'=>'20',
'd'=>'20',
'f'=>'20',
'g'=>'20',
'h'=>'20',
'l'=>'20',
'm'=>'20',
'n'=>'20',
'p'=>'20',
'r'=>'20',
's'=>'20',
'st'=>'200',
't'=>'20',
'x'=>'20',
'z'=>'20'
),
'all'=>array(
'\''=>'22',
'psic'=>'32000',
'pneu'=>'32000',
'æ'=>'01',
'œ'=>'01',
'aia'=>'0100',
'aie'=>'0100',
'aio'=>'0100',
'aiu'=>'0100',
'aea'=>'0010',
'aeo'=>'0010',
'aeu'=>'0010',
'eiu'=>'0100',
'ioi'=>'0010',
'oia'=>'0100',
'oie'=>'0100',
'oio'=>'0100',
'oiu'=>'0100',
'uou'=>'0030',
'b'=>'10',
'bb'=>'200',
'bd'=>'200',
'bl'=>'020',
'bm'=>'200',
'bn'=>'200',
'br'=>'020',
'bt'=>'200',
'bs'=>'200',
'c'=>'10',
'cc'=>'200',
'ch'=>'022',
'cl'=>'020',
'cm'=>'200',
'cn'=>'200',
'cq'=>'200',
'cr'=>'020',
'cs'=>'200',
'ct'=>'200',
'cz'=>'200',
'd'=>'10',
'dd'=>'200',
'dg'=>'200',
'dm'=>'200',
'dr'=>'020',
'ds'=>'200',
'dv'=>'200',
'f'=>'10',
'ff'=>'200',
'fl'=>'020',
'fn'=>'200',
'fr'=>'020',
'ft'=>'200',
'g'=>'10',
'gg'=>'200',
'gd'=>'200',
'gf'=>'200',
'gl'=>'020',
'gm'=>'200',
'gn'=>'020',
'gr'=>'020',
'gs'=>'200',
'gv'=>'200',
'h'=>'10',
'hp'=>'200',
'ht'=>'200',
'j'=>'10',
'k'=>'10',
'kk'=>'200',
'kh'=>'022',
'l'=>'10',
'lb'=>'200',
'lc'=>'200',
'ld'=>'200',
'lf'=>'200',
'lft'=>'0320',
'lg'=>'200',
'lk'=>'200',
'll'=>'200',
'lm'=>'200',
'ln'=>'200',
'lp'=>'200',
'lq'=>'200',
'lr'=>'200',
'ls'=>'200',
'lt'=>'200',
'lv'=>'200',
'm'=>'10',
'mm'=>'200',
'mb'=>'200',
'mp'=>'200',
'ml'=>'200',
'mn'=>'200',
'mq'=>'200',
'mr'=>'200',
'mv'=>'200',
'n'=>'10',
'nb'=>'200',
'nc'=>'200',
'nd'=>'200',
'nf'=>'200',
'ng'=>'200',
'nl'=>'200',
'nm'=>'200',
'nn'=>'200',
'np'=>'200',
'nq'=>'200',
'nr'=>'200',
'ns'=>'200',
'nsm'=>'0230',
'nsf'=>'0230',
'nt'=>'200',
'nv'=>'200',
'nx'=>'200',
'p'=>'10',
'ph'=>'020',
'pl'=>'020',
'pn'=>'200',
'pp'=>'200',
'pr'=>'020',
'ps'=>'200',
'pt'=>'200',
'pz'=>'200',
'php'=>'2000',
'pht'=>'2000',
'qu'=>'102',
'r'=>'10',
'rb'=>'200',
'rc'=>'200',
'rd'=>'200',
'rf'=>'200',
'rg'=>'200',
'rh'=>'020',
'rl'=>'200',
'rm'=>'200',
'rn'=>'200',
'rp'=>'200',
'rq'=>'200',
'rr'=>'200',
'rs'=>'200',
'rt'=>'200',
'rv'=>'200',
'rz'=>'200',
's'=>'12',
'sph'=>'2300',
'ss'=>'230',
'stb'=>'2000',
'stc'=>'2000',
'std'=>'2000',
'stf'=>'2000',
'stg'=>'2000',
'stl'=>'2030',
'stm'=>'2000',
'stn'=>'2000',
'stp'=>'2000',
'stq'=>'2000',
'sts'=>'2000',
'stt'=>'2000',
'stv'=>'2000',
't'=>'10',
'tb'=>'200',
'tc'=>'200',
'td'=>'200',
'tf'=>'200',
'tg'=>'200',
'th'=>'020',
'tl'=>'020',
'tr'=>'020',
'tm'=>'200',
'tn'=>'200',
'tp'=>'200',
'tq'=>'200',
'tt'=>'200',
'tv'=>'200',
'v'=>'10',
'vl'=>'020',
'vr'=>'020',
'vv'=>'200',
'x'=>'10',
'xt'=>'200',
'xx'=>'200',
'z'=>'10',
'aua'=>'0100',
'aue'=>'0100',
'aui'=>'0100',
'auo'=>'0100',
'auu'=>'0100',
'eua'=>'0100',
'eue'=>'0100',
'eui'=>'0100',
'euo'=>'0100',
'euu'=>'0100',
'iua'=>'0100',
'iue'=>'0100',
'iui'=>'0100',
'iuo'=>'0100',
'iuu'=>'0100',
'oua'=>'0100',
'oue'=>'0100',
'oui'=>'0100',
'ouo'=>'0100',
'ouu'=>'0100',
'uua'=>'0100',
'uue'=>'0100',
'uui'=>'0100',
'uuo'=>'0100',
'uuu'=>'0100',
'alua'=>'02100',
'alue'=>'02100',
'alui'=>'02100',
'aluo'=>'02100',
'aluu'=>'02100',
'elua'=>'02100',
'elue'=>'02100',
'elui'=>'02100',
'eluo'=>'02100',
'eluu'=>'02100',
'ilua'=>'02100',
'ilue'=>'02100',
'ilui'=>'02100',
'iluo'=>'02100',
'iluu'=>'02100',
'olua'=>'02100',
'olue'=>'02100',
'olui'=>'02100',
'oluo'=>'02100',
'oluu'=>'02100',
'ulua'=>'02100',
'ulue'=>'02100',
'ului'=>'02100',
'uluo'=>'02100',
'uluu'=>'02100',
'amua'=>'02100',
'amue'=>'02100',
'amui'=>'02100',
'amuo'=>'02100',
'amuu'=>'02100',
'emua'=>'02100',
'emue'=>'02100',
'emui'=>'02100',
'emuo'=>'02100',
'emuu'=>'02100',
'imua'=>'02100',
'imue'=>'02100',
'imui'=>'02100',
'imuo'=>'02100',
'imuu'=>'02100',
'omua'=>'02100',
'omue'=>'02100',
'omui'=>'02100',
'omuo'=>'02100',
'omuu'=>'02100',
'umua'=>'02100',
'umue'=>'02100',
'umui'=>'02100',
'umuo'=>'02100',
'umuu'=>'02100',
'anua'=>'02100',
'anue'=>'02100',
'anui'=>'02100',
'anuo'=>'02100',
'anuu'=>'02100',
'enua'=>'02100',
'enue'=>'02100',
'enui'=>'02100',
'enuo'=>'02100',
'enuu'=>'02100',
'inua'=>'02100',
'inue'=>'02100',
'inui'=>'02100',
'inuo'=>'02100',
'inuu'=>'02100',
'onua'=>'02100',
'onue'=>'02100',
'onui'=>'02100',
'onuo'=>'02100',
'onuu'=>'02100',
'unua'=>'02100',
'unue'=>'02100',
'unui'=>'02100',
'unuo'=>'02100',
'unuu'=>'02100',
'arua'=>'02100',
'arue'=>'02100',
'arui'=>'02100',
'aruo'=>'02100',
'aruu'=>'02100',
'erua'=>'02100',
'erue'=>'02100',
'erui'=>'02100',
'eruo'=>'02100',
'eruu'=>'02100',
'irua'=>'02100',
'irue'=>'02100',
'irui'=>'02100',
'iruo'=>'02100',
'iruu'=>'02100',
'orua'=>'02100',
'orue'=>'02100',
'orui'=>'02100',
'oruo'=>'02100',
'oruu'=>'02100',
'urua'=>'02100',
'urue'=>'02100',
'urui'=>'02100',
'uruo'=>'02100',
'uruu'=>'02100'
)
);
?>

1615
php-typography/lang/lt.php Normal file

File diff suppressed because it is too large

View File

@ -0,0 +1,663 @@
<?php
/*
Project: PHP Typography
Project URI: http://kingdesk.com/projects/php-typography/
File modified to place pattern and exceptions in arrays that can be understood in php files.
This file is released under the same copyright as the below referenced original file
Original unmodified file is available at: http://mirror.unl.edu/ctan/language/hyph-utf8/tex/generic/hyph-utf8/patterns/
Original file name: hyph-mn-cyrl.tex
//============================================================================================================
ORIGINAL FILE INFO
% This file is part of hyph-utf8 package and resulted from
% semi-manual conversions of hyphenation patterns into UTF-8 in June 2008.
%
% Source: mnhyphen.tex (2002-06-30)
% Author: Oliver Corff, Dorjpalam Dorj
%
% The above mentioned file should become obsolete,
% and the author of the original file should preferably modify this file instead.
%
% Modifications were needed in order to support native UTF-8 engines,
% but functionality (hopefully) didn't change in any way, at least not intentionally.
% This file is no longer stand-alone; at least for 8-bit engines
% you probably want to use loadhyph-foo.tex (which will load this file) instead.
%
% Modifications were done by Jonathan Kew, Mojca Miklavec & Arthur Reutenauer
% with help & support from:
% - Karl Berry, who gave us free hands and all resources
% - Taco Hoekwater, with useful macros
% - Hans Hagen, who did the unicodifisation of patterns already long before
% and helped with testing, suggestions and bug reports
% - Norbert Preining, who tested & integrated patterns into TeX Live
%
% However, the 'copyright/copyleft' owner of patterns remains the original author.
%
% The copyright statement of this file is thus:
%
% Do with this file whatever needs to be done in future for the sake of
% 'a better world' as long as you respect the copyright of original file.
% If you're the original author of patterns or taking over a new revolution,
% please remove all of the TUG comments & credits that we added here -
% you are the Queen / the King, we are only the servants.
%
% If you want to change this file, rather than uploading directly to CTAN,
% we would be grateful if you could send it to us (http://tug.org/tex-hyphen)
% or ask for credentials for SVN repository and commit it yourself;
% we will then upload the whole 'package' to CTAN.
%
% Before a new 'pattern-revolution' starts,
% please try to follow some guidelines if possible:
%
% - \lccode is *forbidden*, and I really mean it
% - all the patterns should be in UTF-8
% - the only 'allowed' TeX commands in this file are: \patterns, \hyphenation,
% and if you really cannot do without, also \input and \message
% - in particular, please no \catcode or \lccode changes,
% they belong to loadhyph-foo.tex,
% and no \lefthyphenmin and \righthyphenmin,
% they have no influence here and belong elsewhere
% - \begingroup and/or \endinput is not needed
% - feel free to do whatever you want inside comments
%
% We know that TeX is extremely powerful, but give a stupid parser
% at least a chance to read your patterns.
%
% For more information see
%
% http://tug.org/tex-hyphen
%
%------------------------------------------------------------------------------
%
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
% File: mnhyphen.tex
% Author: Oliver Corff and Dorjpalam Dorj
% Date: February 26th, 1999 % mls.sty prevails
% Version: \VersionRelease % see mls.sty!
% Copyright: Ulaanbaatar, Beijing, Berlin
%
% Description: The Mongolian Hyphenation Pattern File
% to be used together with LMC encoding.
% Hyphenation exceptions should be stored
% in mnhyphex.tex.
%
% It may well be possible that the hyphenation
% patterns given below are incomplete or plainly
% wrong. It should also be mentioned that TeX
% sometimes ignores correct hyphenation information
% and makes up its own mind. Anyway, please con-
% sider all hyphenation data strictly experimental
% and *not yet stable*.
%
% This file is mostly based on Cäwäl's Mongol
% Xälniï Towq Taïlbar Tol' (MXTTT for short;
% ``Short Explanatory Dictionary of Mongolian'')
% but contains a few other sources as well.
%
% Comments, corrections and suggestions are
% highly appreciated and should be directed to
% the authors at corff@zedat.fu-berlin.de
%
% U/B/B, February 1999
%
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
% ------------------- identification -------------------
%
% \message{mnhyphen.tex - Hyphenation Patterns for
% Xalx Mongolian, LMC Encoding}
%
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%
% The following code is closely modelled after russian.sty and
% its accompanying hyphenation file.
%
% We first must make some of the non-ASCII range characters known
% as characters to TeX, and include case mapping information.
//============================================================================================================
*/
$patgenLanguage = 'Mongolian (Cyrillic)';
$patgenExceptions = array();
$patgenMaxSeg = 6;
$patgen = array(
'begin'=>array(
'аа'=>'002',
'ин'=>'002',
'оё'=>'002',
'оо'=>'002',
'өө'=>'002',
'уу'=>'002',
'үү'=>'002',
'ээ'=>'002'
),
'end'=>array(
'дүү'=>'3000'
),
'all'=>array(
'ааж'=>'0020',
'ад'=>'010',
'ади'=>'0200',
'айб'=>'0020',
'ап'=>'010',
'асаа'=>'00003',
'ат'=>'010',
'аф'=>'010',
'ах'=>'010',
'ац'=>'010',
'ацд'=>'0320',
'ач'=>'010',
'аш'=>'010',
'аю'=>'010',
'аяал'=>'02000',
'ба'=>'100',
'байду'=>'000200',
'бами'=>'00300',
'бг'=>'210',
'би'=>'100',
'бл'=>'200',
'бр'=>'210',
'буж'=>'0030',
'ва'=>'100',
'вб'=>'010',
'вг'=>'210',
'вд'=>'210',
'ве'=>'100',
'вед'=>'0020',
'вж'=>'010',
'вз'=>'010',
'ви'=>'100',
'вл'=>'210',
'вн'=>'210',
'во'=>'100',
'вө'=>'100',
'вр'=>'210',
'вс'=>'010',
'вт'=>'210',
'ву'=>'100',
'вү'=>'100',
'вц'=>'010',
'вш'=>'210',
'вэ'=>'100',
'вя'=>'100',
'га'=>'100',
'гб'=>'010',
'гв'=>'010',
'гг'=>'210',
'гд'=>'210',
'гж'=>'210',
'ги'=>'100',
'гл'=>'210',
'гм'=>'210',
'гн'=>'210',
'го'=>'100',
'годи'=>'00200',
'гө'=>'100',
'гр'=>'010',
'грам'=>'02000',
'гре'=>'3000',
'гс'=>'200',
'гт'=>'210',
'гу'=>'100',
'гуулиу'=>'0000002',
'гү'=>'100',
'гх'=>'010',
'гц'=>'010',
'гч'=>'210',
'гш'=>'210',
'гши'=>'0300',
'гы'=>'100',
'гэ'=>'100',
'гэнү'=>'00200',
'гял'=>'0003',
'давы'=>'00100',
'дб'=>'010',
'дв'=>'210',
'дг'=>'210',
'дд'=>'210',
'дек'=>'2000',
'дж'=>'210',
'диа'=>'0001',
'дит'=>'2000',
'дл'=>'210',
'дм'=>'210',
'дн'=>'210',
'др'=>'210',
'дс'=>'210',
'дт'=>'210',
'дх'=>'210',
'дц'=>'210',
'дч'=>'210',
'дъ'=>'200',
'дı'=>'200',
'еб'=>'010',
'ев'=>'010',
'ег'=>'012',
'ед'=>'010',
'ез'=>'010',
'еи'=>'010',
'ел'=>'010',
'ем'=>'010',
'ео'=>'001',
'еп'=>'010',
'ере'=>'0100',
'етру'=>'03200',
'ех'=>'010',
'ец'=>'010',
'еци'=>'0001',
'еш'=>'010',
'ёд'=>'010',
'ёз'=>'010',
'ёоч'=>'0020',
'ёх'=>'002',
'жа'=>'100',
'жв'=>'210',
'жг'=>'210',
'жд'=>'210',
'жж'=>'210',
'жи'=>'100',
'жиг'=>'3000',
'жин'=>'3000',
'жл'=>'210',
'жм'=>'210',
'жн'=>'210',
'жө'=>'100',
'жр'=>'210',
'жс'=>'210',
'жт'=>'210',
'жу'=>'100',
'жү'=>'100',
'жы'=>'100',
'жэ'=>'100',
'за'=>'100',
'зв'=>'210',
'зг'=>'210',
'зд'=>'210',
'зж'=>'210',
'зи'=>'100',
'зл'=>'210',
'зм'=>'210',
'зн'=>'210',
'зо'=>'100',
'зө'=>'100',
'зр'=>'210',
'зс'=>'210',
'зт'=>'210',
'зу'=>'100',
'зү'=>'100',
'зх'=>'010',
'зц'=>'010',
'зч'=>'010',
'зш'=>'210',
'зы'=>'100',
'зı'=>'200',
'зэ'=>'100',
'игра'=>'00200',
'ид'=>'010',
'идал'=>'03000',
'иды'=>'0200',
'иж'=>'010',
'из'=>'010',
'илди'=>'00200',
'исп'=>'0030',
'ит'=>'010',
'их'=>'010',
'иц'=>'010',
'иш'=>'010',
'йб'=>'010',
'йв'=>'010',
'йг'=>'010',
'йгр'=>'0200',
'йд'=>'010',
'йж'=>'010',
'йп'=>'010',
'йпл'=>'0200',
'йр'=>'010',
'йс'=>'010',
'йт'=>'010',
'йх'=>'010',
'йц'=>'010',
'йч'=>'010',
'ка'=>'100',
'ке'=>'100',
'кж'=>'010',
'ки'=>'100',
'кк'=>'010',
'кл'=>'010',
'кн'=>'010',
'коо'=>'0010',
'ксп'=>'0030',
'кт'=>'010',
'ку'=>'100',
'кц'=>'210',
'кэ'=>'100',
'ла'=>'100',
'лб'=>'210',
'лв'=>'010',
'лг'=>'210',
'лд'=>'210',
'ле'=>'100',
'лж'=>'210',
'лз'=>'210',
'ли'=>'100',
'лл'=>'210',
'лли'=>'0001',
'лм'=>'210',
'лн'=>'200',
'ло'=>'100',
'лод'=>'0020',
'лө'=>'100',
'лр'=>'210',
'лс'=>'210',
'лт'=>'210',
'лу'=>'100',
'лү'=>'100',
'лх'=>'210',
'лц'=>'210',
'лч'=>'200',
'лш'=>'200',
'лъ'=>'200',
'лы'=>'100',
'лı'=>'200',
'лэ'=>'100',
'лю'=>'010',
'ма'=>'100',
'мб'=>'010',
'мг'=>'010',
'мд'=>'010',
'ме'=>'100',
'ми'=>'100',
'мин'=>'2000',
'мк'=>'012',
'мл'=>'010',
'мн'=>'010',
'мо'=>'100',
'мө'=>'100',
'мп'=>'210',
'мр'=>'010',
'му'=>'100',
'мү'=>'100',
'мф'=>'010',
'мх'=>'010',
'мц'=>'010',
'мш'=>'010',
'мы'=>'100',
'мэ'=>'100',
'на'=>'100',
'нб'=>'010',
'нв'=>'010',
'нг'=>'010',
'нгр'=>'0200',
'нгре'=>'00200',
'нд'=>'010',
'нёврл'=>'100000',
'ни'=>'100',
'нк'=>'010',
'нл'=>'010',
'нм'=>'010',
'но'=>'100',
'нө'=>'100',
'нп'=>'010',
'нс'=>'010',
'нсд'=>'0320',
'нт'=>'010',
'ну'=>'100',
'нү'=>'100',
'нх'=>'010',
'нц'=>'010',
'ны'=>'100',
'нэ'=>'100',
'ня'=>'010',
'оа'=>'010',
'об'=>'010',
'огр'=>'0120',
'од'=>'010',
'ое'=>'010',
'ож'=>'010',
'оне'=>'0100',
'онст'=>'00300',
'онт'=>'0030',
'оп'=>'012',
'опе'=>'0200',
'осп'=>'0100',
'от'=>'010',
'оф'=>'010',
'ох'=>'010',
'оц'=>'010',
'оэ'=>'010',
'өд'=>'010',
'өж'=>'010',
'өри'=>'0200',
'өх'=>'010',
'өц'=>'010',
'өч'=>'010',
'пд'=>'210',
'по'=>'001',
'пос'=>'0030',
'пп'=>'210',
'пра'=>'2000',
'про'=>'0200',
'пт'=>'210',
'ра'=>'100',
'раб'=>'2000',
'рб'=>'010',
'рв'=>'010',
'рг'=>'210',
'рд'=>'210',
'ри'=>'100',
'рл'=>'210',
'рм'=>'010',
'рн'=>'210',
'ро'=>'100',
'рө'=>'100',
'рп'=>'010',
'рр'=>'010',
'рс'=>'210',
'рт'=>'210',
'ру'=>'100',
'рук'=>'2000',
'рү'=>'100',
'рх'=>'210',
'рц'=>'010',
'рч'=>'200',
'рш'=>'210',
'ры'=>'100',
'рэ'=>'100',
'са'=>'100',
'сб'=>'010',
'св'=>'210',
'сг'=>'210',
'сд'=>'210',
'се'=>'100',
'сж'=>'210',
'сз'=>'010',
'си'=>'100',
'ск'=>'102',
'скв'=>'2000',
'сл'=>'210',
'см'=>'210',
'сн'=>'210',
'со'=>'100',
'сө'=>'100',
'сп'=>'010',
'спе'=>'0200',
'спи'=>'0200',
'ср'=>'210',
'сс'=>'210',
'ст'=>'210',
'су'=>'100',
'сү'=>'100',
'сф'=>'010',
'сх'=>'210',
'сц'=>'010',
'сч'=>'210',
'сшт'=>'0320',
'сы'=>'100',
'сэ'=>'100',
'та'=>'100',
'тб'=>'210',
'тв'=>'210',
'тг'=>'210',
'тд'=>'210',
'тж'=>'210',
'тз'=>'210',
'ти'=>'100',
'тл'=>'210',
'тм'=>'210',
'тн'=>'210',
'то'=>'100',
'тө'=>'100',
'тр'=>'210',
'тро'=>'0200',
'тру'=>'1000',
'тс'=>'210',
'тт'=>'210',
'тү'=>'100',
'тх'=>'210',
'тц'=>'210',
'тч'=>'210',
'тш'=>'210',
'ты'=>'100',
'тэ'=>'100',
'уд'=>'010',
'ужи'=>'0200',
'уз'=>'010',
'ул'=>'010',
'ут'=>'010',
'уф'=>'010',
'ух'=>'010',
'уц'=>'010',
'уш'=>'010',
'үд'=>'010',
'үз'=>'010',
'үзэ'=>'0200',
'үл'=>'010',
'үп'=>'010',
'үсд'=>'0020',
'үх'=>'010',
'үц'=>'010',
'үш'=>'010',
'фд'=>'010',
'фм'=>'010',
'фо'=>'100',
'ха'=>'100',
'хаады'=>'000200',
'хаю'=>'0020',
'хб'=>'210',
'хв'=>'210',
'хг'=>'210',
'хд'=>'210',
'хж'=>'210',
'хз'=>'210',
'хи'=>'100',
'хида'=>'00200',
'хиı'=>'2000',
'хл'=>'210',
'хм'=>'210',
'хн'=>'210',
'хо'=>'100',
'хө'=>'100',
'хр'=>'210',
'хс'=>'210',
'хт'=>'210',
'ху'=>'100',
'хуж'=>'0030',
'хү'=>'100',
'хх'=>'210',
'хц'=>'210',
'хч'=>'200',
'хш'=>'210',
'хы'=>'100',
'хı'=>'200',
'хэ'=>'100',
'ца'=>'100',
'цв'=>'210',
'цг'=>'210',
'цд'=>'210',
'цж'=>'210',
'цл'=>'210',
'цм'=>'210',
'цн'=>'210',
'цр'=>'210',
'цс'=>'210',
'цт'=>'210',
'цх'=>'210',
'цч'=>'210',
'цъ'=>'200',
'ча'=>'100',
'чв'=>'010',
'чг'=>'210',
'чд'=>'210',
'чи'=>'100',
'чл'=>'210',
'чм'=>'210',
'чн'=>'210',
'чо'=>'100',
'чр'=>'210',
'чс'=>'210',
'чт'=>'210',
'чу'=>'100',
'чү'=>'100',
'чх'=>'210',
'чэ'=>'100',
'ша'=>'100',
'шб'=>'010',
'шв'=>'210',
'шг'=>'210',
'шд'=>'210',
'шж'=>'210',
'ши'=>'100',
'шк'=>'210',
'шл'=>'210',
'шм'=>'210',
'шн'=>'210',
'шо'=>'100',
'шө'=>'100',
'шр'=>'210',
'шс'=>'210',
'шт'=>'210',
'шу'=>'100',
'шү'=>'100',
'шүүлı'=>'000300',
'шх'=>'210',
'шч'=>'210',
'шэ'=>'100',
'ъе'=>'012',
'ъё'=>'012',
'ъя'=>'012',
'ыг'=>'010',
'ыс'=>'010',
'ых'=>'010',
'ıб'=>'010',
'ıд'=>'010',
'ıк'=>'010',
'ıт'=>'010',
'ıх'=>'010',
'ıц'=>'010',
'ıч'=>'010',
'ıш'=>'010',
'ıя'=>'012',
'эд'=>'010',
'эж'=>'010',
'эз'=>'010',
'энэхи'=>'020000',
'эх'=>'010',
'эц'=>'010',
'юд'=>'200',
'яа'=>'100',
'яд'=>'010',
'яншд'=>'00020',
'ят'=>'010',
'ях'=>'010',
'яш'=>'010'
)
);
?>
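
The entries above pair a letter sequence with a digit string that is one character longer than the sequence, which suggests one digit for each gap before, between, and after the letters. The following is a minimal sketch of how such a pair could be read back into TeX-style pattern notation. It assumes that reading of the digit strings; `sketch_to_tex_pattern()` is a hypothetical helper written for illustration and is not part of PHP Typography.

```php
<?php
// Hypothetical helper for illustration only (not a PHP Typography function).
// Assumption: $digits holds one digit per gap, so it has one more character
// than $letters has letters, and '0' means "no weight at this gap".
function sketch_to_tex_pattern(string $letters, string $digits): string
{
    // Split on UTF-8 code points so Cyrillic letters stay intact.
    $chars = preg_split('//u', $letters, -1, PREG_SPLIT_NO_EMPTY);
    $out = '';
    foreach (str_split($digits) as $i => $digit) {
        if ($digit !== '0') {
            $out .= $digit;      // weight of the gap in front of letter $i
        }
        if (isset($chars[$i])) {
            $out .= $chars[$i];  // the letter itself
        }
    }
    return $out;
}

echo sketch_to_tex_pattern('ба', '100'), "\n";   // 1ба
echo sketch_to_tex_pattern('ацд', '0320'), "\n"; // а3ц2д
```

Under the same assumption, entries in the 'begin' and 'end' arrays would correspond to TeX patterns anchored with a leading or trailing dot.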

27307
php-typography/lang/no.php Normal file

File diff suppressed because it is too large

4194
php-typography/lang/pl.php Normal file

File diff suppressed because it is too large

426
php-typography/lang/pt.php Normal file

@ -0,0 +1,426 @@
<?php
/*
Project: PHP Typography
Project URI: http://kingdesk.com/projects/php-typography/
File modified to place pattern and exceptions in arrays that can be understood in php files.
This file is released under the same copyright as the below referenced original file
Original unmodified file is available at: http://mirror.unl.edu/ctan/language/hyph-utf8/tex/generic/hyph-utf8/patterns/
Original file name: hyph-pt.tex
//============================================================================================================
ORIGINAL FILE INFO
% This file is part of hyph-utf8 package and resulted from
% semi-manual conversions of hyphenation patterns into UTF-8 in June 2008.
%
% Source: pthyph.tex (1994-10-13 - date on CTAN) or (1996-07-21 - date in file) - no idea
% Author: Pedro J. de Rezende <rezende at dcc.unicamp.br>, J.Joao Dias Almeida <jj at di.uminho.pt>
%
% The above mentioned file should become obsolete,
% and the author of the original file should preferably modify this file instead.
%
% Modifications were needed in order to support native UTF-8 engines,
% but functionality (hopefully) didn't change in any way, at least not intentionally.
% This file is no longer stand-alone; at least for 8-bit engines
% you probably want to use loadhyph-foo.tex (which will load this file) instead.
%
% Modifications were done by Jonathan Kew, Mojca Miklavec & Arthur Reutenauer
% with help & support from:
% - Karl Berry, who gave us free hands and all resources
% - Taco Hoekwater, with useful macros
% - Hans Hagen, who did the unicodifisation of patterns already long before
% and helped with testing, suggestions and bug reports
% - Norbert Preining, who tested & integrated patterns into TeX Live
%
% However, the 'copyright/copyleft' owner of patterns remains the original author.
%
% The copyright statement of this file is thus:
%
% Do with this file whatever needs to be done in future for the sake of
% 'a better world' as long as you respect the copyright of original file.
% If you're the original author of patterns or taking over a new revolution,
% please remove all of the TUG comments & credits that we added here -
% you are the Queen / the King, we are only the servants.
%
% If you want to change this file, rather than uploading directly to CTAN,
% we would be grateful if you could send it to us (http://tug.org/tex-hyphen)
% or ask for credentials for SVN repository and commit it yourself;
% we will then upload the whole 'package' to CTAN.
%
% Before a new 'pattern-revolution' starts,
% please try to follow some guidelines if possible:
%
% - \lccode is *forbidden*, and I really mean it
% - all the patterns should be in UTF-8
% - the only 'allowed' TeX commands in this file are: \patterns, \hyphenation,
% and if you really cannot do without, also \input and \message
% - in particular, please no \catcode or \lccode changes,
% they belong to loadhyph-foo.tex,
% and no \lefthyphenmin and \righthyphenmin,
% they have no influence here and belong elsewhere
% - \begingroup and/or \endinput is not needed
% - feel free to do whatever you want inside comments
%
% We know that TeX is extremely powerful, but give a stupid parser
% at least a chance to read your patterns.
%
% For more information see
%
% http://tug.org/tex-hyphen
%
%------------------------------------------------------------------------------
%
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
% The Portuguese TeX hyphenation table.
% (C) 1996 by Pedro J. de Rezende (rezende@dcc.unicamp.br)
% and J.Joao Dias Almeida (jj@di.uminho.pt)
% Version: 1.2 Release date: 21/07/96
%
% (C) 1994 by Pedro J. de Rezende (rezende@dcc.unicamp.br)
% Version: 1.1 Release date: 04/12/94
%
% (C) 1987 by Pedro J. de Rezende
% Version: 1.0 Release date: 02/13/87
%
% -----------------------------------------------------------------
% IMPORTANT NOTICE:
%
% This program can be redistributed and/or modified under the terms
% of the LaTeX Project Public License Distributed from CTAN
% archives in directory macros/latex/base/lppl.txt; either
% version 1 of the License, or any later version.
% -----------------------------------------------------------------
% Remember! If you *must* change it, then call the resulting file
% something else and attach your name to your *documented* changes.
% ======================================================================
//============================================================================================================
*/
$patgenLanguage = 'Portuguese';
$patgenExceptions = array(
'hardware'=>'hard-ware',
'software'=>'soft-ware'
);
$patgenMaxSeg = 3;
$patgen = array(
'begin'=>array(),
'end'=>array(),
'all'=>array(
'bl'=>'120',
'br'=>'120',
'ba'=>'100',
'be'=>'100',
'bi'=>'100',
'bo'=>'100',
'bu'=>'100',
'bá'=>'100',
'bâ'=>'100',
'bã'=>'100',
'bé'=>'100',
'bí'=>'100',
'bó'=>'100',
'bú'=>'100',
'bê'=>'100',
'bõ'=>'100',
'ch'=>'120',
'cl'=>'120',
'cr'=>'120',
'ca'=>'100',
'ce'=>'100',
'ci'=>'100',
'co'=>'100',
'cu'=>'100',
'cá'=>'100',
'câ'=>'100',
'cã'=>'100',
'cé'=>'100',
'cí'=>'100',
'có'=>'100',
'cú'=>'100',
'cê'=>'100',
'cõ'=>'100',
'ça'=>'100',
'çe'=>'100',
'çi'=>'100',
'ço'=>'100',
'çu'=>'100',
'çá'=>'100',
'çâ'=>'100',
'çã'=>'100',
'çé'=>'100',
'çí'=>'100',
'çó'=>'100',
'çú'=>'100',
'çê'=>'100',
'çõ'=>'100',
'dl'=>'120',
'dr'=>'120',
'da'=>'100',
'de'=>'100',
'di'=>'100',
'do'=>'100',
'du'=>'100',
'dá'=>'100',
'dâ'=>'100',
'dã'=>'100',
'dé'=>'100',
'dí'=>'100',
'dó'=>'100',
'dú'=>'100',
'dê'=>'100',
'dõ'=>'100',
'fl'=>'120',
'fr'=>'120',
'fa'=>'100',
'fe'=>'100',
'fi'=>'100',
'fo'=>'100',
'fu'=>'100',
'fá'=>'100',
'fâ'=>'100',
'fã'=>'100',
'fé'=>'100',
'fí'=>'100',
'fó'=>'100',
'fú'=>'100',
'fê'=>'100',
'fõ'=>'100',
'gl'=>'120',
'gr'=>'120',
'ga'=>'100',
'ge'=>'100',
'gi'=>'100',
'go'=>'100',
'gu'=>'100',
'gua'=>'1040',
'gue'=>'1040',
'gui'=>'1040',
'guo'=>'1040',
'gá'=>'100',
'gâ'=>'100',
'gã'=>'100',
'gé'=>'100',
'gí'=>'100',
'gó'=>'100',
'gú'=>'100',
'gê'=>'100',
'gõ'=>'100',
'ja'=>'100',
'je'=>'100',
'ji'=>'100',
'jo'=>'100',
'ju'=>'100',
'já'=>'100',
'jâ'=>'100',
'jã'=>'100',
'jé'=>'100',
'jí'=>'100',
'jó'=>'100',
'jú'=>'100',
'jê'=>'100',
'jõ'=>'100',
'kl'=>'120',
'kr'=>'120',
'ka'=>'100',
'ke'=>'100',
'ki'=>'100',
'ko'=>'100',
'ku'=>'100',
'ká'=>'100',
'kâ'=>'100',
'kã'=>'100',
'ké'=>'100',
'kí'=>'100',
'kó'=>'100',
'kú'=>'100',
'kê'=>'100',
'kõ'=>'100',
'lh'=>'120',
'la'=>'100',
'le'=>'100',
'li'=>'100',
'lo'=>'100',
'lu'=>'100',
'lá'=>'100',
'lâ'=>'100',
'lã'=>'100',
'lé'=>'100',
'lí'=>'100',
'ló'=>'100',
'lú'=>'100',
'lê'=>'100',
'lõ'=>'100',
'ma'=>'100',
'me'=>'100',
'mi'=>'100',
'mo'=>'100',
'mu'=>'100',
'má'=>'100',
'mâ'=>'100',
'mã'=>'100',
'mé'=>'100',
'mí'=>'100',
'mó'=>'100',
'mú'=>'100',
'mê'=>'100',
'mõ'=>'100',
'nh'=>'120',
'na'=>'100',
'ne'=>'100',
'ni'=>'100',
'no'=>'100',
'nu'=>'100',
'ná'=>'100',
'nâ'=>'100',
'nã'=>'100',
'né'=>'100',
'ní'=>'100',
'nó'=>'100',
'nú'=>'100',
'nê'=>'100',
'nõ'=>'100',
'pl'=>'120',
'pr'=>'120',
'pa'=>'100',
'pe'=>'100',
'pi'=>'100',
'po'=>'100',
'pu'=>'100',
'pá'=>'100',
'pâ'=>'100',
'pã'=>'100',
'pé'=>'100',
'pí'=>'100',
'pó'=>'100',
'pú'=>'100',
'pê'=>'100',
'põ'=>'100',
'qua'=>'1040',
'que'=>'1040',
'qui'=>'1040',
'quo'=>'1040',
'ra'=>'100',
're'=>'100',
'ri'=>'100',
'ro'=>'100',
'ru'=>'100',
'rá'=>'100',
'râ'=>'100',
'rã'=>'100',
'ré'=>'100',
'rí'=>'100',
'ró'=>'100',
'rú'=>'100',
'rê'=>'100',
'rõ'=>'100',
'sa'=>'100',
'se'=>'100',
'si'=>'100',
'so'=>'100',
'su'=>'100',
'sá'=>'100',
'sâ'=>'100',
'sã'=>'100',
'sé'=>'100',
'sí'=>'100',
'só'=>'100',
'sú'=>'100',
'sê'=>'100',
'sõ'=>'100',
'tl'=>'120',
'tr'=>'120',
'ta'=>'100',
'te'=>'100',
'ti'=>'100',
'to'=>'100',
'tu'=>'100',
'tá'=>'100',
'tâ'=>'100',
'tã'=>'100',
'té'=>'100',
'tí'=>'100',
'tó'=>'100',
'tú'=>'100',
'tê'=>'100',
'tõ'=>'100',
'vl'=>'120',
'vr'=>'120',
'va'=>'100',
've'=>'100',
'vi'=>'100',
'vo'=>'100',
'vu'=>'100',
'vá'=>'100',
'vâ'=>'100',
'vã'=>'100',
'vé'=>'100',
'ví'=>'100',
'vó'=>'100',
'vú'=>'100',
'vê'=>'100',
'võ'=>'100',
'wl'=>'120',
'wr'=>'120',
'xa'=>'100',
'xe'=>'100',
'xi'=>'100',
'xo'=>'100',
'xu'=>'100',
'xá'=>'100',
'xâ'=>'100',
'xã'=>'100',
'xé'=>'100',
'xí'=>'100',
'xó'=>'100',
'xú'=>'100',
'xê'=>'100',
'xõ'=>'100',
'za'=>'100',
'ze'=>'100',
'zi'=>'100',
'zo'=>'100',
'zu'=>'100',
'zá'=>'100',
'zâ'=>'100',
'zã'=>'100',
'zé'=>'100',
'zí'=>'100',
'zó'=>'100',
'zú'=>'100',
'zê'=>'100',
'zõ'=>'100',
'aa'=>'030',
'ae'=>'030',
'ao'=>'030',
'cc'=>'030',
'ea'=>'030',
'ee'=>'030',
'eo'=>'030',
'ia'=>'030',
'ie'=>'030',
'ii'=>'030',
'io'=>'030',
'iâ'=>'030',
'iê'=>'030',
'iô'=>'030',
'oa'=>'030',
'oe'=>'030',
'oo'=>'030',
'rr'=>'030',
'ss'=>'030',
'ua'=>'030',
'ue'=>'030',
'uo'=>'030',
'uu'=>'030',
'-'=>'10'
)
);
?>
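
The Portuguese file above combines an exception list with a pattern table. The following sketch shows one way these two structures could be applied to a single lowercase word, in the style of Liang's pattern hyphenation: every matching pattern contributes its digits, the highest digit at each gap wins, and an odd result permits a break. It is written for PHP 7 or later and rests on several assumptions: that digit strings carry one digit per gap, that `$patgenMaxSeg` is the longest pattern length in characters, and that an exception overrides the patterns. `sketch_hyphenate()` is a hypothetical name and does not reproduce the library's own code.

```php
<?php
// Illustrative sketch only: sketch_hyphenate() is a hypothetical helper,
// not a function from PHP Typography.

// A small excerpt shaped like the Portuguese file above.
$patgenExceptions = array('software' => 'soft-ware');
$patgenMaxSeg = 3;
$patgen = array(
    'begin' => array(),
    'end'   => array(),
    'all'   => array('ca' => '100', 'sa' => '100'),
);

function sketch_hyphenate(string $word, array $patgen, array $exceptions, int $maxSeg): string
{
    // Assumption: an exception entry replaces pattern matching entirely.
    if (isset($exceptions[$word])) {
        return $exceptions[$word];
    }

    // Split on UTF-8 code points so accented letters count as one character.
    $chars = preg_split('//u', $word, -1, PREG_SPLIT_NO_EMPTY);
    $n = count($chars);
    // $scores[$i] is the weight of the gap in front of character $i.
    $scores = array_fill(0, $n + 1, 0);

    for ($start = 0; $start < $n; $start++) {
        for ($len = 1; $len <= $maxSeg && $start + $len <= $n; $len++) {
            $seg = implode('', array_slice($chars, $start, $len));
            $hits = array();
            if (isset($patgen['all'][$seg])) {
                $hits[] = $patgen['all'][$seg];
            }
            // 'begin' patterns only match at the start of the word.
            if ($start === 0 && isset($patgen['begin'][$seg])) {
                $hits[] = $patgen['begin'][$seg];
            }
            // 'end' patterns only match at the end of the word.
            if ($start + $len === $n && isset($patgen['end'][$seg])) {
                $hits[] = $patgen['end'][$seg];
            }
            foreach ($hits as $digits) {
                // One digit per gap; the highest digit seen at a gap wins.
                foreach (str_split($digits) as $offset => $digit) {
                    $gap = $start + $offset;
                    $scores[$gap] = max($scores[$gap], (int) $digit);
                }
            }
        }
    }

    // An odd final weight permits a break; the word edges are never broken.
    $out = '';
    foreach ($chars as $i => $char) {
        if ($i > 0 && $scores[$i] % 2 === 1) {
            $out .= '-';
        }
        $out .= $char;
    }
    return $out;
}

echo sketch_hyphenate('casa', $patgen, $patgenExceptions, $patgenMaxSeg), "\n";     // ca-sa
echo sketch_hyphenate('software', $patgen, $patgenExceptions, $patgenMaxSeg), "\n"; // soft-ware
```

The sketch ignores minimum distances from the word edges, so a real implementation would likely filter the break points further.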

789
php-typography/lang/ro.php Normal file

@ -0,0 +1,789 @@
<?php
/*
Project: PHP Typography
Project URI: http://kingdesk.com/projects/php-typography/
File modified to place pattern and exceptions in arrays that can be understood in php files.
This file is released under the same copyright as the below referenced original file
Original unmodified file is available at: http://mirror.unl.edu/ctan/language/hyph-utf8/tex/generic/hyph-utf8/patterns/
Original file name: hyph-ro.tex
//============================================================================================================
ORIGINAL FILE INFO
% This file is part of hyph-utf8 package and resulted from
% semi-manual conversions of hyphenation patterns into UTF-8 in June 2008.
%
% Source: rohyphen.tex (1996-11-11)
% Author: Adrian Rezus <adriaan at {sci,cs}.kun.nl>
%
% The above mentioned file should become obsolete,
% and the author of the original file should preferably modify this file instead.
%
% Modifications were needed in order to support native UTF-8 engines,
% but functionality (hopefully) didn't change in any way, at least not intentionally.
% This file is no longer stand-alone; at least for 8-bit engines
% you probably want to use loadhyph-foo.tex (which will load this file) instead.
%
% Modifications were done by Jonathan Kew, Mojca Miklavec & Arthur Reutenauer
% with help & support from:
% - Karl Berry, who gave us free hands and all resources
% - Taco Hoekwater, with useful macros
% - Hans Hagen, who did the unicodifisation of patterns already long before
% and helped with testing, suggestions and bug reports
% - Norbert Preining, who tested & integrated patterns into TeX Live
%
% However, the 'copyright/copyleft' owner of patterns remains the original author.
%
% The copyright statement of this file is thus:
%
% Do with this file whatever needs to be done in future for the sake of
% 'a better world' as long as you respect the copyright of original file.
% If you're the original author of patterns or taking over a new revolution,
% please remove all of the TUG comments & credits that we added here -
% you are the Queen / the King, we are only the servants.
%
% If you want to change this file, rather than uploading directly to CTAN,
% we would be grateful if you could send it to us (http://tug.org/tex-hyphen)
% or ask for credentials for SVN repository and commit it yourself;
% we will then upload the whole 'package' to CTAN.
%
% Before a new 'pattern-revolution' starts,
% please try to follow some guidelines if possible:
%
% - \lccode is *forbidden*, and I really mean it
% - all the patterns should be in UTF-8
% - the only 'allowed' TeX commands in this file are: \patterns, \hyphenation,
% and if you really cannot do without, also \input and \message
% - in particular, please no \catcode or \lccode changes,
% they belong to loadhyph-foo.tex,
% and no \lefthyphenmin and \righthyphenmin,
% they have no influence here and belong elsewhere
% - \begingroup and/or \endinput is not needed
% - feel free to do whatever you want inside comments
%
% We know that TeX is extremely powerful, but give a stupid parser
% at least a chance to read your patterns.
%
% For more information see
%
% http://tug.org/tex-hyphen
%
%------------------------------------------------------------------------------
%
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%% ROHYPHEN.TEX, version 1.1 <29.10.1996> R [7.11.1996] %%
%% (C) 1995-1996 Adrian Rezus [adriaan@{sci,cs}.kun.nl] %%
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%%
%% Romanian TeX hyphenation table: NFSS 2 encoding, medium.
%% Contents: 647 Romanian hyphen patterns, with diacritics.
%%
%% This file is part of the Romanian TeX system.
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%% Romanian TeX, version 1.3R <29.10.1996> %%
%% (C) 1994-1996 Adrian Rezus %%
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%% History:
%% ROHYPHEN.TEX 1.0 <10.02.1995>: Plain TeX and LaTeX 2.09.
%% ROHYPHEN.TEX 1.1 <29.10.1996>: Plain TeX and LaTeX2e.
%
% -------------------------------------------------------------------
% TODO: fix the notice below - it only holds for the old patterns
% NB This file must be used in conjunction with either one of
%
% (1) ROMANIAN.TEX v1.2(R) [1994-1995] [(La)TeX] or
% (2) ROMANIAN.STY v1.3R [1996] [(La)TeX(2e)]
%
% NB Romanian has LR-HYPHEN-MINs [2 2] (like German)!
% NB Romanian has STRUCTURAL HYPHEN-AMBIGUA:
% i.e., words that canNOT be hyphenated correctly without
% additional (e.g., semantic, stress-mark) information.
% --------------------------------------------------------
% The Romanian TeX encoding of the Romanian diacritics:
% --------------------------------------------------------
% Romanian TeX DQ-macro encodings = (La)TeX macros
% --------------------------------------------------------
% ă = \u{a} [-] \u{A} [not encoded]
% â = \^{a} [-] \^{A} [not encoded]
% î = \^{\i} 'I = \^{I}
% ș = \c{s} 'S = \c{S}
% ț = \c{t} 'T = \c{T}
% -------------------------------------------------------------
% NB Romanian \^{a} behaves like \^{\i} as regards hyphenation.
% NB The capital \u{A} and \^{A} are rare in script; as such,
% they occur only in records of the Romanian substandard.
% -------------------------------------------------------------------
%
% original patterns generated by PatGen2-output hyphen-level 9: do NOT modify the list by hand!
//============================================================================================================
*/
$patgenLanguage = 'Romanian';
$patgenExceptions = array();
$patgenMaxSeg = 7;
$patgen = array(
'begin'=>array(
'aic'=>'0300',
'anis'=>'04300',
'az'=>'020',
'cre'=>'0001',
'deaj'=>'00200',
'dez'=>'0021',
'g'=>'04',
'ia'=>'020',
'ie'=>'020',
'iț'=>'030',
'iu'=>'043',
'iv'=>'030',
'îm'=>'040',
'n'=>'02',
'ni'=>'002',
'p'=>'04',
'preș'=>'00030',
's'=>'04',
'ș'=>'04',
'ui'=>'040',
'uni'=>'0500',
'z'=>'02'
),
'end'=>array(
'an'=>'200',
'ăti'=>'0200',
'b'=>'20',
'bia'=>'0020',
'bține'=>'000400',
'c'=>'40',
'chi'=>'2000',
'ci'=>'200',
'd'=>'40',
'f'=>'20',
'fi'=>'200',
'g'=>'20',
'ghi'=>'2000',
'gi'=>'200',
'h'=>'20',
'hi'=>'200',
'i'=>'40',
'j'=>'20',
'ji'=>'200',
'l'=>'40',
'li'=>'400',
'm'=>'20',
'mi'=>'400',
'n'=>'40',
'ni'=>'400',
'obi'=>'0200',
'omedie'=>'0000020',
'orte'=>'00200',
'p'=>'20',
'pi'=>'200',
'pie'=>'0030',
'pți'=>'0400',
'r'=>'40',
'ri'=>'400',
's'=>'40',
'sc'=>'400',
'see'=>'0040',
'ș'=>'40',
'și'=>'400',
'ști'=>'4000',
't'=>'40',
'ti'=>'400',
'tii'=>'3000',
'tî'=>'200',
'tru'=>'3000',
'ț'=>'20',
'ți'=>'200',
'ția'=>'0030',
'u'=>'60',
'ua'=>'020',
'v'=>'20',
'vi'=>'200',
'x'=>'20',
'z'=>'20',
'zi'=>'200'
),
'all'=>array(
'a'=>'01',
'acă'=>'2000',
'achi'=>'00005',
'ae'=>'030',
'afo'=>'0003',
'aia'=>'0320',
'aie'=>'0320',
'ail'=>'0300',
'ais'=>'0032',
'aiu'=>'0300',
'alie'=>'00006',
'alt'=>'2000',
'am'=>'020',
'an'=>'020',
'ane'=>'0520',
'anie'=>'00020',
'aniș'=>'00034',
'ans'=>'0040',
'anu'=>'2000',
'anz'=>'0020',
'aog'=>'0020',
'atia'=>'00040',
'atr'=>'2000',
'atu'=>'0540',
'ața'=>'2000',
'ață'=>'2000',
'au'=>'200',
'aua'=>'0300',
'aud'=>'0300',
'aug'=>'0300',
'aul'=>'0300',
'aun'=>'0300',
'aur'=>'0300',
'aus'=>'0300',
'aute'=>'03000',
'auț'=>'0320',
'auz'=>'0300',
'ă'=>'21',
'ăi'=>'030',
'ăie'=>'0020',
'ăm'=>'022',
'ănu'=>'0003',
'ărgi'=>'00005',
'ăș'=>'030',
'ășt'=>'0430',
'ătie'=>'00040',
'ău'=>'030',
'ăv'=>'030',
'ăzi'=>'0200',
'b'=>'10',
'baț'=>'0020',
'bănu'=>'00005',
'bc'=>'200',
'bd'=>'200',
'biat'=>'00200',
'bie'=>'0020',
'bii'=>'3000',
'bl'=>'020',
'blim'=>'34000',
'blu'=>'0400',
'bo'=>'001',
'boric'=>'003000',
'bs'=>'200',
'bt'=>'200',
'bț'=>'200',
'bu'=>'003',
'c'=>'10',
'caut'=>'00300',
'căc'=>'0020',
'cătu'=>'00005',
'cc'=>'200',
'cea'=>'0020',
'ceț'=>'0020',
'ciale'=>'003000',
'cio'=>'0020',
'cis'=>'0002',
'cisp'=>'00300',
'ciza'=>'00002',
'cl'=>'040',
'cm'=>'200',
'cn'=>'250',
'copiată'=>'00000200',
'coț'=>'0020',
'cs'=>'200',
'ct'=>'200',
'cț'=>'200',
'cuim'=>'00300',
'cul'=>'3000',
'cuț'=>'0020',
'cv'=>'200',
'd'=>'10',
'dam'=>'0040',
'daț'=>'0020',
'dc'=>'200',
'desc'=>'00400',
'dezin'=>'000300',
'dian'=>'00200',
'diată'=>'000200',
'dj'=>'200',
'dm'=>'200',
'dn'=>'210',
'doil'=>'00400',
'du'=>'300',
'eac'=>'0100',
'eaj'=>'0100',
'eal'=>'0100',
'eaș'=>'0100',
'eat'=>'0100',
'eaț'=>'0020',
'eav'=>'0100',
'ebui'=>'00050',
'ec'=>'200',
'ecia'=>'00020',
'eclare'=>'0000200',
'ediulu'=>'0004000',
'ee'=>'030',
'eea'=>'0020',
'efa'=>'1000',
'eh'=>'010',
'eia'=>'0320',
'eie'=>'0320',
'eii'=>'0300',
'eil'=>'0300',
'eim'=>'0300',
'ein'=>'0300',
'eio'=>'0320',
'eis'=>'0332',
'eit'=>'0300',
'eiu'=>'0340',
'eî'=>'010',
'el'=>'200',
'em'=>'020',
'emon'=>'00005',
'en'=>'200',
'ene'=>'0500',
'eo'=>'011',
'eon'=>'0300',
'er'=>'010',
'era'=>'2000',
'eră'=>'2000',
'erc'=>'2000',
'es'=>'220',
'esco'=>'00300',
'esti'=>'00500',
'eș'=>'200',
'eși'=>'0300',
'etanț'=>'000040',
'eț'=>'200',
'eu'=>'030',
'euș'=>'0050',
'evit'=>'10000',
'ex'=>'020',
'ez'=>'200',
'eză'=>'0005',
'ezia'=>'00030',
'ezo'=>'0210',
'f'=>'14',
'fa'=>'300',
'făș'=>'3000',
'fie'=>'0030',
'fo'=>'300',
'ft'=>'200',
'ftu'=>'0500',
'g'=>'12',
'găț'=>'0030',
'gl'=>'040',
'gm'=>'230',
'gn'=>'230',
'gon'=>'0050',
'gu'=>'303',
'gv'=>'230',
'hia'=>'0020',
'hic'=>'0030',
'hiu'=>'0040',
'hn'=>'210',
'i'=>'21',
'iac'=>'3200',
'iag'=>'0034',
'iai'=>'0200',
'iaș'=>'0200',
'iaț'=>'0020',
'ică'=>'0300',
'ied'=>'0200',
'iia'=>'0300',
'iie'=>'0300',
'iii'=>'0300',
'iil'=>'0300',
'iin'=>'0300',
'iir'=>'0300',
'iit'=>'0300',
'iitură'=>'0000200',
'iî'=>'020',
'ila'=>'4000',
'ile'=>'0300',
'ilo'=>'0300',
'imateri'=>'00000006',
'in'=>'020',
'ined'=>'04100',
'ingă'=>'00200',
'inții'=>'000040',
'inv'=>'3000',
'iod'=>'0300',
'ioni'=>'03000',
'ioț'=>'0020',
'ipă'=>'0005',
'is'=>'020',
'isf'=>'0030',
'isp'=>'4000',
'ișt'=>'0030',
'iti'=>'0500',
'iția'=>'00020',
'ițio'=>'03020',
'iua'=>'0300',
'iul'=>'0300',
'ium'=>'0300',
'iund'=>'03000',
'iunu'=>'03000',
'ius'=>'0300',
'iut'=>'0300',
'izv'=>'0030',
'î'=>'02',
'îd'=>'030',
'îe'=>'030',
'îlo'=>'0300',
'îna'=>'0003',
'înș'=>'0050',
'îri'=>'0300',
'îrî'=>'0300',
'îrș'=>'0050',
'îșt'=>'0030',
'ît'=>'030',
'îti'=>'0400',
'îț'=>'030',
'îți'=>'0400',
'îții'=>'05000',
'îz'=>'030',
'j'=>'10',
'jd'=>'200',
'jiț'=>'0020',
'jl'=>'200',
'ju'=>'040',
'jut'=>'0030',
'k'=>'10',
'l'=>'10',
'larați'=>'0000002',
'lăti'=>'00200',
'lătu'=>'00005',
'lb'=>'200',
'lc'=>'200',
'ld'=>'200',
'lea'=>'0020',
'lf'=>'200',
'lg'=>'200',
'lia'=>'0030',
'lie'=>'0030',
'lio'=>'0030',
'lm'=>'200',
'ln'=>'250',
'lp'=>'200',
'ls'=>'200',
'lș'=>'230',
'lt'=>'200',
'lț'=>'200',
'lu'=>'300',
'lv'=>'200',
'm'=>'10',
'ma'=>'300',
'mă'=>'300',
'mb'=>'200',
'mblîn'=>'000003',
'me'=>'300',
'mez'=>'0020',
'mf'=>'200',
'mi'=>'300',
'miț'=>'0020',
'mî'=>'300',
'mn'=>'210',
'mo'=>'300',
'mon'=>'0004',
'mp'=>'200',
'ms'=>'232',
'mt'=>'200',
'mț'=>'200',
'mu'=>'300',
'muț'=>'0020',
'mv'=>'200',
'na'=>'300',
'nad'=>'4100',
'nain'=>'00300',
'nă'=>'300',
'nc'=>'200',
'ncis'=>'02000',
'nciz'=>'02000',
'nd'=>'200',
'ne'=>'300',
'neab'=>'00100',
'nean'=>'00100',
'neap'=>'00100',
'nef'=>'4000',
'neg'=>'4100',
'nes'=>'0032',
'nevi'=>'40000',
'nex'=>'4100',
'ng'=>'200',
'ngăt'=>'00300',
'ni'=>'300',
'niez'=>'00300',
'nî'=>'300',
'nj'=>'030',
'nn'=>'010',
'no'=>'300',
'noș'=>'0040',
'nr'=>'010',
'ns'=>'232',
'nsf'=>'0030',
'nsî'=>'0400',
'nspo'=>'00300',
'nș'=>'032',
'nși'=>'0400',
'nt'=>'200',
'nti'=>'0500',
'ntu'=>'0540',
'nț'=>'200',
'nu'=>'500',
'nua'=>'0030',
'nuă'=>'0030',
'num'=>'0050',
'nus'=>'0032',
'nz'=>'200',
'oag'=>'0100',
'oal'=>'0200',
'oca'=>'2000',
'ocui'=>'00050',
'od'=>'200',
'odia'=>'00020',
'oe'=>'030',
'oi'=>'032',
'oiecti'=>'0000002',
'oisp'=>'00320',
'omn'=>'0040',
'on'=>'200',
'oo'=>'010',
'opie'=>'00030',
'opla'=>'00002',
'oplagi'=>'0000002',
'ora'=>'0100',
'oră'=>'0100',
'orc'=>'0020',
'ore'=>'0100',
'ori'=>'0100',
'oric'=>'02000',
'orî'=>'0100',
'oro'=>'0100',
'oru'=>'0100',
'osti'=>'00500',
'oși'=>'0300',
'otați'=>'000004',
'oti'=>'0500',
'otod'=>'00300',
'ou'=>'030',
'p'=>'12',
'pa'=>'300',
'părț'=>'00030',
'pc'=>'230',
'pecți'=>'000002',
'peț'=>'0020',
'pie'=>'0020',
'piez'=>'00300',
'pio'=>'0030',
'piț'=>'0020',
'piz'=>'0020',
'pl'=>'040',
'poș'=>'0040',
'poț'=>'0020',
'ps'=>'230',
'pș'=>'230',
'pt'=>'230',
'pț'=>'230',
'pub'=>'0034',
'purie'=>'000020',
'puș'=>'0040',
'rb'=>'200',
'rc'=>'200',
'rd'=>'200',
're'=>'020',
'rebi'=>'00200',
'recizi'=>'0000002',
'rescr'=>'003200',
'reși'=>'00400',
'rf'=>'200',
'rg'=>'200',
'rh'=>'210',
'ria'=>'0030',
'riali'=>'004000',
'rieț'=>'00300',
'riez'=>'00300',
'rimi'=>'00500',
'riun'=>'20300',
'riv'=>'0030',
'rk'=>'200',
'rl'=>'200',
'rm'=>'200',
'rn'=>'210',
'rnaț'=>'00020',
'rografi'=>'00000006',
'rp'=>'200',
'rr'=>'210',
'rs'=>'202',
'rsp'=>'0300',
'rst'=>'0300',
'rș'=>'230',
'rt'=>'200',
'rtuale'=>'0000200',
'rț'=>'200',
'ruil'=>'00300',
'rusp'=>'00300',
'rv'=>'200',
'rz'=>'200',
's'=>'10',
'sa'=>'500',
'să'=>'500',
'săm'=>'0040',
'săș'=>'0040',
'sc'=>'200',
'sco'=>'3200',
'se'=>'300',
'sea'=>'0020',
'ses'=>'0002',
'sesp'=>'00300',
'seș'=>'0040',
'sf'=>'420',
'sfî'=>'5000',
'si'=>'300',
'sip'=>'0030',
'sî'=>'300',
'sl'=>'340',
'sm'=>'400',
'sn'=>'010',
'so'=>'300',
'soric'=>'003000',
'sp'=>'200',
'st'=>'200',
'sto'=>'0003',
'su'=>'500',
'suț'=>'0020',
'ș'=>'20',
'șa'=>'300',
'șaț'=>'0020',
'șă'=>'302',
'șe'=>'300',
'și'=>'100',
'șii'=>'5000',
'șil'=>'5000',
'șin'=>'3000',
'șî'=>'300',
'șn'=>'450',
'șnu'=>'0005',
'șo'=>'300',
'șp'=>'020',
'ști'=>'0200',
'ștr'=>'4300',
'șu'=>'300',
't'=>'12',
'taut'=>'00300',
'tc'=>'230',
'td'=>'230',
'tea'=>'0020',
'teni'=>'00500',
'terială'=>'00006000',
'tesp'=>'00320',
'tf'=>'230',
'tia'=>'0030',
'tie'=>'0030',
'til'=>'3000',
'tin'=>'3000',
'tiț'=>'0020',
'tl'=>'040',
'tm'=>'230',
'tol'=>'3000',
'tor'=>'3000',
'toto'=>'00200',
'trul'=>'30000',
'truo'=>'30000',
'ts'=>'432',
'tt'=>'230',
'tua'=>'0030',
'tuim'=>'00300',
'tun'=>'4300',
'tuș'=>'0040',
'tz'=>'430',
'ț'=>'10',
'ța'=>'300',
'ță'=>'300',
'țeț'=>'0020',
'ția'=>'3000',
'ție'=>'3000',
'ții'=>'3000',
'țil'=>'3000',
'țiț'=>'0020',
'țiu'=>'3000',
'țu'=>'003',
'țui'=>'0050',
'u'=>'21',
'uad'=>'0200',
'uau'=>'0300',
'uă'=>'003',
'uăs'=>'0002',
'ubia'=>'02000',
'ubl'=>'0230',
'ubo'=>'0210',
'ubs'=>'0032',
'ue'=>'030',
'ugu'=>'4000',
'uia'=>'0320',
'uie'=>'0320',
'uin'=>'0300',
'uir'=>'0300',
'uis'=>'0300',
'uit'=>'0300',
'uiț'=>'0320',
'uiz'=>'0300',
'ul'=>'020',
'ula'=>'0300',
'ulă'=>'0300',
'ule'=>'0300',
'ulii'=>'03000',
'ulî'=>'0300',
'ulo'=>'0300',
'umir'=>'00050',
'urz'=>'0020',
'us'=>'020',
'uspr'=>'00200',
'ust'=>'0400',
'uș'=>'030',
'ușt'=>'0400',
'uto'=>'0200',
'utor'=>'30000',
'uui'=>'0300',
'uum'=>'0300',
'v'=>'10',
'veni'=>'00500',
'veț'=>'0020',
'vez'=>'0020',
'viț'=>'0020',
'vn'=>'210',
'vorbito'=>'00000002',
'vr'=>'300',
'x'=>'10',
'xa'=>'300',
'xă'=>'300',
'xe'=>'300',
'xez'=>'0020',
'xi'=>'300',
'xo'=>'300',
'xu'=>'300',
'z'=>'10',
'zaț'=>'0020',
'zb'=>'200',
'zg'=>'220',
'zian'=>'00200',
'ziar'=>'00200',
'zii'=>'3000',
'zil'=>'3000',
'zm'=>'040',
'zn'=>'210',
'zol'=>'3200',
'zon'=>'3000',
'zuț'=>'0020',
'zv'=>'220',
'zvă'=>'0300'
)
);
?>
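
The header of the Romanian file notes that Romanian uses LR-HYPHEN-MINs [2 2], meaning at least two characters must remain on each side of a break. The pattern table itself does not encode that limit, so a caller would have to enforce it separately. The sketch below shows one possible filter over candidate break positions; `sketch_apply_hyphen_mins()` and its parameters are hypothetical names chosen for illustration and are not taken from PHP Typography.

```php
<?php
// Hypothetical helper for illustration only (not a PHP Typography function).
// $breaks holds candidate break positions, where position $p means
// "break before the character at index $p" (so $p characters stay on the left).
function sketch_apply_hyphen_mins(array $breaks, int $wordLength, int $minLeft = 2, int $minRight = 2): array
{
    $kept = array_filter($breaks, function ($pos) use ($wordLength, $minLeft, $minRight) {
        // Keep a break only if enough characters remain on both sides.
        return $pos >= $minLeft && ($wordLength - $pos) >= $minRight;
    });
    // Re-index so the result is a plain list.
    return array_values($kept);
}

// A six-character word with candidate breaks after 1, 3, and 5 characters:
// only the break after 3 characters satisfies the [2 2] limits.
print_r(sketch_apply_hyphen_mins(array(1, 3, 5), 6));
```

The defaults of 2 and 2 mirror the values quoted in the Romanian header; other languages would need their own limits.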

5104
php-typography/lang/ru.php Normal file

File diff suppressed because it is too large

697
php-typography/lang/sa.php Normal file

@ -0,0 +1,697 @@
<?php
/*
Project: PHP Typography
Project URI: http://kingdesk.com/projects/php-typography/
File modified to place pattern and exceptions in arrays that can be understood in php files.
This file is released under the same copyright as the below referenced original file
Original unmodified file is available at: http://mirror.unl.edu/ctan/language/hyph-utf8/tex/generic/hyph-utf8/patterns/
Original file name: hyph-sa.tex
//============================================================================================================
ORIGINAL FILE INFO
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%
% File name: hyph-sa.tex
%
% Unicode hyphenation patterns for Sanskrit and Prakrit in Devanagari,
% Bengali, Gujarati, Kannada, Malayalam and Telugu scripts.
%
% Created: April 1st, 2005
% First release: June 8th, 2006
% Revised: October 3rd, 2008
% Version: 0.3
%
% Created by Yves Codet with Jonathan Kew's help.
%
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
//============================================================================================================
*/
$patgenLanguage = 'Sanskrit';
$patgenExceptions = array();
$patgenMaxSeg = 4;
$patgen = array(
'begin'=>array(),
'end'=>array(
'क्'=>'200',
'ख्'=>'200',
'ग्'=>'200',
'घ्'=>'200',
'ङ्'=>'200',
'च्'=>'200',
'छ्'=>'200',
'ज्'=>'200',
'झ्'=>'200',
'ञ्'=>'200',
'ट्'=>'200',
'ठ्'=>'200',
'ड्'=>'200',
'ढ्'=>'200',
'ण्'=>'200',
'त्'=>'200',
'थ्'=>'200',
'द्'=>'200',
'ध्'=>'200',
'न्'=>'200',
'प्'=>'200',
'फ्'=>'200',
'ब्'=>'200',
'भ्'=>'200',
'म्'=>'200',
'य्'=>'200',
'र्'=>'200',
'ल्'=>'200',
'ळ्'=>'200',
'व्'=>'200',
'श्'=>'200',
'ष्'=>'200',
'स्'=>'200',
'ह्'=>'200',
'र्क्'=>'20000',
'र्ट्'=>'20000',
'र्त्'=>'20000',
'र्प्'=>'20000',
'ক্'=>'200',
'খ্'=>'200',
'গ্'=>'200',
'ঘ্'=>'200',
'ঙ্'=>'200',
'চ্'=>'200',
'ছ্'=>'200',
'জ্'=>'200',
'ঝ্'=>'200',
'ঞ্'=>'200',
'ট্'=>'200',
'ঠ্'=>'200',
'ড্'=>'200',
'ড়্'=>'2000',
'ঢ্'=>'200',
'ঢ়্'=>'2000',
'ণ্'=>'200',
'ত্'=>'200',
'থ্'=>'200',
'দ্'=>'200',
'ধ্'=>'200',
'ন্'=>'200',
'প্'=>'200',
'ফ্'=>'200',
'ব্'=>'200',
'ভ্'=>'200',
'ম্'=>'200',
'য্'=>'200',
'য়্'=>'2000',
'র্'=>'200',
'ল্'=>'200',
'শ্'=>'200',
'ষ্'=>'200',
'স্'=>'200',
'হ্'=>'200',
'র্ক'=>'2000',
'র্ট'=>'2000',
'র্ত'=>'2000',
'র্প'=>'2000',
'ક્'=>'200',
'ખ્'=>'200',
'ગ્'=>'200',
'ઘ્'=>'200',
'ઙ્'=>'200',
'ચ્'=>'200',
'છ્'=>'200',
'જ્'=>'200',
'ઝ્'=>'200',
'ઞ્'=>'200',
'ટ્'=>'200',
'ઠ્'=>'200',
'ડ્'=>'200',
'ઢ્'=>'200',
'ણ્'=>'200',
'ત્'=>'200',
'થ્'=>'200',
'દ્'=>'200',
'ધ્'=>'200',
'ન્'=>'200',
'પ્'=>'200',
'ફ્'=>'200',
'બ્'=>'200',
'ભ્'=>'200',
'મ્'=>'200',
'ય્'=>'200',
'ર્'=>'200',
'લ્'=>'200',
'ળ્'=>'200',
'વ્'=>'200',
'શ્'=>'200',
'ષ્'=>'200',
'સ્'=>'200',
'હ્'=>'200',
'ર્ક'=>'2000',
'ર્ટ'=>'2000',
'ર્ત'=>'2000',
'ર્પ'=>'2000',
'ಕ್'=>'200',
'ಖ್'=>'200',
'ಗ್'=>'200',
'ಘ್'=>'200',
'ಙ್'=>'200',
'ಚ್'=>'200',
'ಛ್'=>'200',
'ಜ್'=>'200',
'ಝ್'=>'200',
'ಞ್'=>'200',
'ಟ್'=>'200',
'ಠ್'=>'200',
'ಡ್'=>'200',
'ಢ್'=>'200',
'ಣ್'=>'200',
'ತ್'=>'200',
'ಥ್'=>'200',
'ದ್'=>'200',
'ಧ್'=>'200',
'ನ್'=>'200',
'ಪ್'=>'200',
'ಫ್'=>'200',
'ಬ್'=>'200',
'ಭ್'=>'200',
'ಮ್'=>'200',
'ಯ್'=>'200',
'ರ್'=>'200',
'ಱ್'=>'200',
'ಲ್'=>'200',
'ಳ್'=>'200',
'ವ್'=>'200',
'ಶ್'=>'200',
'ಷ್'=>'200',
'ಸ್'=>'200',
'ಹ್'=>'200',
'ರ್ಕ'=>'2000',
'ರ್ಟ'=>'2000',
'ರ್ತ'=>'2000',
'ರ್ಪ'=>'2000',
'ക്'=>'200',
'ഖ്'=>'200',
'ഗ്'=>'200',
'ഘ്'=>'200',
'ങ്'=>'200',
'ച്'=>'200',
'ഛ്'=>'200',
'ജ്'=>'200',
'ഝ്'=>'200',
'ഞ്'=>'200',
'ട്'=>'200',
'ഠ്'=>'200',
'ഡ്'=>'200',
'ഢ്'=>'200',
'ണ്'=>'200',
'ത്'=>'200',
'ഥ്'=>'200',
'ദ്'=>'200',
'ധ്'=>'200',
'ന്'=>'200',
'പ്'=>'200',
'ഫ്'=>'200',
'ബ്'=>'200',
'ഭ്'=>'200',
'മ്'=>'200',
'യ്'=>'200',
'ര്'=>'200',
'റ്'=>'200',
'ല്'=>'200',
'ള്'=>'200',
'ഴ്'=>'200',
'വ്'=>'200',
'ശ്'=>'200',
'ഷ്'=>'200',
'സ്'=>'200',
'ഹ്'=>'200',
'ര്ക'=>'2000',
'ര്ട'=>'2000',
'ര്ത'=>'2000',
'ര്പ'=>'2000',
'క్'=>'200',
'ఖ్'=>'200',
'గ్'=>'200',
'ఘ్'=>'200',
'ఙ్'=>'200',
'చ్'=>'200',
'ఛ్'=>'200',
'జ్'=>'200',
'ఝ్'=>'200',
'ఞ్'=>'200',
'ట్'=>'200',
'ఠ్'=>'200',
'డ్'=>'200',
'ఢ్'=>'200',
'ణ్'=>'200',
'త్'=>'200',
'థ్'=>'200',
'ద్'=>'200',
'ధ్'=>'200',
'న్'=>'200',
'ప్'=>'200',
'ఫ్'=>'200',
'బ్'=>'200',
'భ్'=>'200',
'మ్'=>'200',
'య్'=>'200',
'ర్'=>'200',
'ఱ్'=>'200',
'ల్'=>'200',
'ళ్'=>'200',
'వ్'=>'200',
'శ్'=>'200',
'ష్'=>'200',
'స్'=>'200',
'హ్'=>'200',
'ర్క్'=>'20000',
'ర్ట్'=>'20000',
'ర్త్'=>'20000',
'ర్ప్'=>'20000'
),
'all'=>array(
''=>'22',
''=>'22',
'अ'=>'11',
'आ'=>'11',
'इ'=>'11',
'ई'=>'11',
'उ'=>'11',
'ऊ'=>'11',
'ऋ'=>'11',
'ॠ'=>'11',
'ऌ'=>'11',
'ॡ'=>'11',
'ए'=>'11',
'ऐ'=>'11',
'ओ'=>'11',
'औ'=>'11',
'ा'=>'21',
'ि'=>'21',
'ी'=>'21',
'ु'=>'21',
'ू'=>'21',
'ृ'=>'21',
'ॄ'=>'21',
'ॢ'=>'21',
'ॣ'=>'21',
'े'=>'21',
'ै'=>'21',
'ो'=>'21',
'ौ'=>'21',
'क'=>'11',
'ख'=>'11',
'ग'=>'11',
'घ'=>'11',
'ङ'=>'11',
'च'=>'11',
'छ'=>'11',
'ज'=>'11',
'झ'=>'11',
'ञ'=>'11',
'ट'=>'11',
'ठ'=>'11',
'ड'=>'11',
'ढ'=>'11',
'ण'=>'11',
'त'=>'11',
'थ'=>'11',
'द'=>'11',
'ध'=>'11',
'न'=>'11',
'प'=>'11',
'फ'=>'11',
'ब'=>'11',
'भ'=>'11',
'म'=>'11',
'य'=>'11',
'र'=>'11',
'ल'=>'11',
'ळ'=>'11',
'व'=>'11',
'श'=>'11',
'ष'=>'11',
'स'=>'11',
'ह'=>'11',
'ँ'=>'20',
'ं'=>'20',
'ः'=>'20',
'ऽ'=>'22',
'॑'=>'20',
'॒'=>'20',
'्'=>'22',
'অ'=>'11',
'আ'=>'11',
'ই'=>'11',
'ঈ'=>'11',
'উ'=>'11',
'ঊ'=>'11',
'ঋ'=>'11',
'ৠ'=>'11',
'ঌ'=>'11',
'ৡ'=>'11',
'এ'=>'11',
'ঐ'=>'11',
'ও'=>'11',
'ঔ'=>'11',
'া'=>'21',
'ি'=>'21',
'ী'=>'21',
'ু'=>'21',
'ূ'=>'21',
'ৃ'=>'21',
'ৄ'=>'21',
'ৢ'=>'21',
'ৣ'=>'21',
'ে'=>'21',
'ৈ'=>'21',
'ো'=>'21',
'ৌ'=>'21',
'ক'=>'11',
'খ'=>'11',
'গ'=>'11',
'ঘ'=>'11',
'ঙ'=>'11',
'চ'=>'11',
'ছ'=>'11',
'জ'=>'11',
'ঝ'=>'11',
'ঞ'=>'11',
'ট'=>'11',
'ঠ'=>'11',
'ড'=>'11',
'ড়'=>'101',
'ঢ'=>'11',
'ঢ়'=>'101',
'ণ'=>'11',
'ত'=>'11',
'থ'=>'11',
'দ'=>'11',
'ধ'=>'11',
'ন'=>'11',
'প'=>'11',
'ফ'=>'11',
'ব'=>'11',
'ভ'=>'11',
'ম'=>'11',
'য'=>'11',
'য়'=>'101',
'র'=>'11',
'ল'=>'11',
'শ'=>'11',
'ষ'=>'11',
'স'=>'11',
'হ'=>'11',
'ৎ'=>'12',
'ঁ'=>'20',
'ং'=>'20',
'ঃ'=>'20',
'ঽ'=>'22',
'়'=>'20',
'ৗ'=>'20',
'্'=>'22',
'અ'=>'11',
'આ'=>'11',
'ઇ'=>'11',
'ઈ'=>'11',
'ઉ'=>'11',
'ઊ'=>'11',
'ઋ'=>'11',
'ૠ'=>'11',
'ઌ'=>'11',
'ૡ'=>'11',
'એ'=>'11',
'ઐ'=>'11',
'ઓ'=>'11',
'ઔ'=>'11',
'ા'=>'21',
'િ'=>'21',
'ી'=>'21',
'ુ'=>'21',
'ૂ'=>'21',
'ૃ'=>'21',
'ૄ'=>'21',
'ૢ'=>'21',
'ૣ'=>'21',
'ે'=>'21',
'ૈ'=>'21',
'ો'=>'21',
'ૌ'=>'21',
'ક'=>'11',
'ખ'=>'11',
'ગ'=>'11',
'ઘ'=>'11',
'ઙ'=>'11',
'ચ'=>'11',
'છ'=>'11',
'જ'=>'11',
'ઝ'=>'11',
'ઞ'=>'11',
'ટ'=>'11',
'ઠ'=>'11',
'ડ'=>'11',
'ઢ'=>'11',
'ણ'=>'11',
'ત'=>'11',
'થ'=>'11',
'દ'=>'11',
'ધ'=>'11',
'ન'=>'11',
'પ'=>'11',
'ફ'=>'11',
'બ'=>'11',
'ભ'=>'11',
'મ'=>'11',
'ય'=>'11',
'ર'=>'11',
'લ'=>'11',
'ળ'=>'11',
'વ'=>'11',
'શ'=>'11',
'ષ'=>'11',
'સ'=>'11',
'હ'=>'11',
'ઁ'=>'20',
'ં'=>'20',
'ઃ'=>'20',
'ઽ'=>'22',
'્'=>'22',
'ಅ'=>'11',
'ಆ'=>'11',
'ಇ'=>'11',
'ಈ'=>'11',
'ಉ'=>'11',
'ಊ'=>'11',
'ಋ'=>'11',
'ೠ'=>'11',
'ಌ'=>'11',
'ೡ'=>'11',
'ಎ'=>'11',
'ಏ'=>'11',
'ಐ'=>'11',
'ಒ'=>'11',
'ಓ'=>'11',
'ಔ'=>'11',
'ಾ'=>'21',
'ಿ'=>'21',
'ೀ'=>'21',
'ು'=>'21',
'ೂ'=>'21',
'ೃ'=>'21',
'ೄ'=>'21',
'ೆ'=>'21',
'ೇ'=>'21',
'ೈ'=>'21',
'ೊ'=>'21',
'ೋ'=>'21',
'ೌ'=>'21',
'ಕ'=>'11',
'ಖ'=>'11',
'ಗ'=>'11',
'ಘ'=>'11',
'ಙ'=>'11',
'ಚ'=>'11',
'ಛ'=>'11',
'ಜ'=>'11',
'ಝ'=>'11',
'ಞ'=>'11',
'ಟ'=>'11',
'ಠ'=>'11',
'ಡ'=>'11',
'ಢ'=>'11',
'ಣ'=>'11',
'ತ'=>'11',
'ಥ'=>'11',
'ದ'=>'11',
'ಧ'=>'11',
'ನ'=>'11',
'ಪ'=>'11',
'ಫ'=>'11',
'ಬ'=>'11',
'ಭ'=>'11',
'ಮ'=>'11',
'ಯ'=>'11',
'ರ'=>'11',
'ಱ'=>'11',
'ಲ'=>'11',
'ಳ'=>'11',
'ೞ'=>'11',
'ವ'=>'11',
'ಶ'=>'11',
'ಷ'=>'11',
'ಸ'=>'11',
'ಹ'=>'11',
'ಂ'=>'20',
'ಃ'=>'20',
'ಽ'=>'22',
'ೕ'=>'20',
'ೖ'=>'20',
'್'=>'22',
'അ'=>'11',
'ആ'=>'11',
'ഇ'=>'11',
'ഈ'=>'11',
'ഉ'=>'11',
'ഊ'=>'11',
'ഋ'=>'11',
'ൠ'=>'11',
'ഌ'=>'11',
'ൡ'=>'11',
'എ'=>'11',
'ഏ'=>'11',
'ഐ'=>'11',
'ഒ'=>'11',
'ഓ'=>'11',
'ഔ'=>'11',
'ാ'=>'21',
'ി'=>'21',
'ീ'=>'21',
'ു'=>'21',
'ൂ'=>'21',
'ൃ'=>'21',
'െ'=>'21',
'േ'=>'21',
'ൈ'=>'21',
'ൊ'=>'21',
'ോ'=>'21',
'ൌ'=>'21',
'ക'=>'11',
'ഖ'=>'11',
'ഗ'=>'11',
'ഘ'=>'11',
'ങ'=>'11',
'ച'=>'11',
'ഛ'=>'11',
'ജ'=>'11',
'ഝ'=>'11',
'ഞ'=>'11',
'ട'=>'11',
'ഠ'=>'11',
'ഡ'=>'11',
'ഢ'=>'11',
'ണ'=>'11',
'ത'=>'11',
'ഥ'=>'11',
'ദ'=>'11',
'ധ'=>'11',
'ന'=>'11',
'പ'=>'11',
'ഫ'=>'11',
'ബ'=>'11',
'ഭ'=>'11',
'മ'=>'11',
'യ'=>'11',
'ര'=>'11',
'റ'=>'11',
'ല'=>'11',
'ള'=>'11',
'ഴ'=>'11',
'വ'=>'11',
'ശ'=>'11',
'ഷ'=>'11',
'സ'=>'11',
'ഹ'=>'11',
'ം'=>'20',
'ഃ'=>'20',
'ൗ'=>'20',
'്'=>'22',
'అ'=>'11',
'ఆ'=>'11',
'ఇ'=>'11',
'ఈ'=>'11',
'ఉ'=>'11',
'ఊ'=>'11',
'ఋ'=>'11',
'ౠ'=>'11',
'ఌ'=>'11',
'ౡ'=>'11',
'ఎ'=>'11',
'ఏ'=>'11',
'ఐ'=>'11',
'ఒ'=>'11',
'ఓ'=>'11',
'ఔ'=>'11',
'ా'=>'21',
'ి'=>'21',
'ీ'=>'21',
'ు'=>'21',
'ూ'=>'21',
'ృ'=>'21',
'ౄ'=>'21',
'ె'=>'21',
'ే'=>'21',
'ై'=>'21',
'ొ'=>'21',
'ో'=>'21',
'ౌ'=>'21',
'క'=>'11',
'ఖ'=>'11',
'గ'=>'11',
'ఘ'=>'11',
'ఙ'=>'11',
'చ'=>'11',
'ఛ'=>'11',
'జ'=>'11',
'ఝ'=>'11',
'ఞ'=>'11',
'ట'=>'11',
'ఠ'=>'11',
'డ'=>'11',
'ఢ'=>'11',
'ణ'=>'11',
'త'=>'11',
'థ'=>'11',
'ద'=>'11',
'ధ'=>'11',
'న'=>'11',
'ప'=>'11',
'ఫ'=>'11',
'బ'=>'11',
'భ'=>'11',
'మ'=>'11',
'య'=>'11',
'ర'=>'11',
'ఱ'=>'11',
'ల'=>'11',
'ళ'=>'11',
'వ'=>'11',
'శ'=>'11',
'ష'=>'11',
'స'=>'11',
'హ'=>'11',
'ఁ'=>'20',
'ం'=>'20',
'ః'=>'20',
'ౕ'=>'20',
'ౖ'=>'20',
'్'=>'22'
)
);
?>
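
In the `end` and `all` sections above, each digit string is one character longer than its key when the key is counted in Unicode code points (for example, the Bengali key made of U+09A1 plus the nukta U+09BC carries three digits). A minimal sketch of a consistency check built on that observation follows. The function name `patgen_length_errors` is hypothetical and not part of PHP Typography, the `begin` section is skipped because this file has no `begin` entries to confirm its layout against, and the `\u{...}` escapes assume PHP 7 or later.

```php
<?php
// Hypothetical helper for illustration; it is not part of PHP Typography.
// Returns "section:key" for every entry whose digit string is not exactly
// one character longer than the key, counting the key in code points.
function patgen_length_errors(array $patgen)
{
    $bad = array();
    // 'begin' is deliberately skipped: its layout is not shown in this file.
    foreach (array('end', 'all') as $section) {
        if (!isset($patgen[$section])) {
            continue;
        }
        foreach ($patgen[$section] as $key => $digits) {
            $key = (string) $key; // PHP casts numeric-looking keys to int
            $codePoints = preg_match_all('/./us', $key); // count, not bytes
            if (strlen($digits) !== $codePoints + 1) {
                $bad[] = $section . ':' . $key;
            }
        }
    }
    return $bad;
}

// Made-up sample: two well-formed entries and one with a digit missing.
$sample = array(
    'begin' => array(),
    'end'   => array("\u{0915}\u{094D}" => '200'),
    'all'   => array("\u{09A1}\u{09BC}" => '101', 'ab' => '01'),
);
print_r(patgen_length_errors($sample));
```

A check of this kind would also report an entry whose key is an empty string, since no digit string of length one appears in the file.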

File diff suppressed because it is too large

File diff suppressed because it is too large

2586
php-typography/lang/sk.php Normal file

File diff suppressed because it is too large

1193
php-typography/lang/sl.php Normal file

File diff suppressed because it is too large

File diff suppressed because it is too large

4835
php-typography/lang/sv.php Normal file

File diff suppressed because it is too large

659
php-typography/lang/tr.php Normal file

@ -0,0 +1,659 @@
<?php
/*
Project: PHP Typography
Project URI: http://kingdesk.com/projects/php-typography/
File modified to place pattern and exceptions in arrays that can be understood in php files.
This file is released under the same copyright as the below referenced original file
Original unmodified file is available at: http://mirror.unl.edu/ctan/language/hyph-utf8/tex/generic/hyph-utf8/patterns/
Original file name: hyph-tr.tex
//============================================================================================================
ORIGINAL FILE INFO
% hyph-tr.tex
%
% File auto-generated from generate_patterns_tr.rb that is part of hyph-utf8
%
% Licence:
% - Free enough for Debian & TeX Live or any other distributor
% - If you are reading this and have some suggestion about what to put here, please advise
% - I (Mojca) would prefer to say 'public domain', but don't know what it means for the original authors
%
% Credits:
% - algorithm developed by P. A. MacKay for the Ottoman Texts Project in 1987
% - rules adapted for modern Turkish by H. Turgut Uyar <uyar at itu.edu.tr>
% - initiative to improve Turkish patterns by S. Ekin Kocabas <kocabas at stanford.edu>
% - script written by Mojca Miklavec <mojca.miklavec.lists at gmail.com> in June 2008
%
% Modifications:
% - adapted for the use on modern UTF-8 TeX engines
% - UTF-8 patterns are used
% - only letters for Modern Turkish + âîû (the first one often needed, the other two don't hurt)
% - if needed, support for Ottoman Turkish might be provided separately under 'ota' (not 'tr')
%
% Notes:
% - you need to use loadhyph-tr.tex, please do not try to put \catcode-s & \lccode-s here
//============================================================================================================
*/
$patgenLanguage = 'Turkish';
$patgenExceptions = array();
$patgenMaxSeg = 10;
$patgen = array(
'begin'=>array(),
'end'=>array(
'ecek'=>'22000'
),
'all'=>array(
'a'=>'21',
'â'=>'21',
'e'=>'21',
'ı'=>'21',
'i'=>'21',
'î'=>'21',
'o'=>'21',
'ö'=>'21',
'u'=>'21',
'ü'=>'21',
'û'=>'21',
// allow hyphen either side of consonants
'b'=>'11',
'c'=>'11',
'ç'=>'11',
'd'=>'11',
'f'=>'11',
'g'=>'11',
'ğ'=>'11',
'h'=>'11',
'j'=>'11',
'k'=>'11',
'l'=>'11',
'm'=>'11',
'n'=>'11',
'p'=>'11',
'r'=>'11',
's'=>'11',
'ş'=>'11',
't'=>'11',
'v'=>'11',
'y'=>'11',
'z'=>'11',
'bb'=>'200',
'bc'=>'200',
'bç'=>'200',
'bd'=>'200',
'bf'=>'200',
'bg'=>'200',
'bğ'=>'200',
'bh'=>'200',
'bj'=>'200',
'bk'=>'200',
'bl'=>'200',
'bm'=>'200',
'bn'=>'200',
'bp'=>'200',
'br'=>'200',
'bs'=>'200',
'bş'=>'200',
'bt'=>'200',
'bv'=>'200',
'by'=>'200',
'bz'=>'200',
'cb'=>'200',
'cc'=>'200',
'cç'=>'200',
'cd'=>'200',
'cf'=>'200',
'cg'=>'200',
'cğ'=>'200',
'ch'=>'200',
'cj'=>'200',
'ck'=>'200',
'cl'=>'200',
'cm'=>'200',
'cn'=>'200',
'cp'=>'200',
'cr'=>'200',
'cs'=>'200',
'cş'=>'200',
'ct'=>'200',
'cv'=>'200',
'cy'=>'200',
'cz'=>'200',
'çb'=>'200',
'çc'=>'200',
'çç'=>'200',
'çd'=>'200',
'çf'=>'200',
'çg'=>'200',
'çğ'=>'200',
'çh'=>'200',
'çj'=>'200',
'çk'=>'200',
'çl'=>'200',
'çm'=>'200',
'çn'=>'200',
'çp'=>'200',
'çr'=>'200',
'çs'=>'200',
'çş'=>'200',
'çt'=>'200',
'çv'=>'200',
'çy'=>'200',
'çz'=>'200',
'db'=>'200',
'dc'=>'200',
'dç'=>'200',
'dd'=>'200',
'df'=>'200',
'dg'=>'200',
'dğ'=>'200',
'dh'=>'200',
'dj'=>'200',
'dk'=>'200',
'dl'=>'200',
'dm'=>'200',
'dn'=>'200',
'dp'=>'200',
'dr'=>'200',
'ds'=>'200',
'dş'=>'200',
'dt'=>'200',
'dv'=>'200',
'dy'=>'200',
'dz'=>'200',
'fb'=>'200',
'fc'=>'200',
'fç'=>'200',
'fd'=>'200',
'ff'=>'200',
'fg'=>'200',
'fğ'=>'200',
'fh'=>'200',
'fj'=>'200',
'fk'=>'200',
'fl'=>'200',
'fm'=>'200',
'fn'=>'200',
'fp'=>'200',
'fr'=>'200',
'fs'=>'200',
'fş'=>'200',
'ft'=>'200',
'fv'=>'200',
'fy'=>'200',
'fz'=>'200',
'gb'=>'200',
'gc'=>'200',
'gç'=>'200',
'gd'=>'200',
'gf'=>'200',
'gg'=>'200',
'gğ'=>'200',
'gh'=>'200',
'gj'=>'200',
'gk'=>'200',
'gl'=>'200',
'gm'=>'200',
'gn'=>'200',
'gp'=>'200',
'gr'=>'200',
'gs'=>'200',
'gş'=>'200',
'gt'=>'200',
'gv'=>'200',
'gy'=>'200',
'gz'=>'200',
'ğb'=>'200',
'ğc'=>'200',
'ğç'=>'200',
'ğd'=>'200',
'ğf'=>'200',
'ğg'=>'200',
'ğğ'=>'200',
'ğh'=>'200',
'ğj'=>'200',
'ğk'=>'200',
'ğl'=>'200',
'ğm'=>'200',
'ğn'=>'200',
'ğp'=>'200',
'ğr'=>'200',
'ğs'=>'200',
'ğş'=>'200',
'ğt'=>'200',
'ğv'=>'200',
'ğy'=>'200',
'ğz'=>'200',
'hb'=>'200',
'hc'=>'200',
'hç'=>'200',
'hd'=>'200',
'hf'=>'200',
'hg'=>'200',
'hğ'=>'200',
'hh'=>'200',
'hj'=>'200',
'hk'=>'200',
'hl'=>'200',
'hm'=>'200',
'hn'=>'200',
'hp'=>'200',
'hr'=>'200',
'hs'=>'200',
'hş'=>'200',
'ht'=>'200',
'hv'=>'200',
'hy'=>'200',
'hz'=>'200',
'jb'=>'200',
'jc'=>'200',
'jç'=>'200',
'jd'=>'200',
'jf'=>'200',
'jg'=>'200',
'jğ'=>'200',
'jh'=>'200',
'jj'=>'200',
'jk'=>'200',
'jl'=>'200',
'jm'=>'200',
'jn'=>'200',
'jp'=>'200',
'jr'=>'200',
'js'=>'200',
'jş'=>'200',
'jt'=>'200',
'jv'=>'200',
'jy'=>'200',
'jz'=>'200',
'kb'=>'200',
'kc'=>'200',
'kç'=>'200',
'kd'=>'200',
'kf'=>'200',
'kg'=>'200',
'kğ'=>'200',
'kh'=>'200',
'kj'=>'200',
'kk'=>'200',
'kl'=>'200',
'km'=>'200',
'kn'=>'200',
'kp'=>'200',
'kr'=>'200',
'ks'=>'200',
'kş'=>'200',
'kt'=>'200',
'kv'=>'200',
'ky'=>'200',
'kz'=>'200',
'lb'=>'200',
'lc'=>'200',
'lç'=>'200',
'ld'=>'200',
'lf'=>'200',
'lg'=>'200',
'lğ'=>'200',
'lh'=>'200',
'lj'=>'200',
'lk'=>'200',
'll'=>'200',
'lm'=>'200',
'ln'=>'200',
'lp'=>'200',
'lr'=>'200',
'ls'=>'200',
'lş'=>'200',
'lt'=>'200',
'lv'=>'200',
'ly'=>'200',
'lz'=>'200',
'mb'=>'200',
'mc'=>'200',
'mç'=>'200',
'md'=>'200',
'mf'=>'200',
'mg'=>'200',
'mğ'=>'200',
'mh'=>'200',
'mj'=>'200',
'mk'=>'200',
'ml'=>'200',
'mm'=>'200',
'mn'=>'200',
'mp'=>'200',
'mr'=>'200',
'ms'=>'200',
'mş'=>'200',
'mt'=>'200',
'mv'=>'200',
'my'=>'200',
'mz'=>'200',
'nb'=>'200',
'nc'=>'200',
'nç'=>'200',
'nd'=>'200',
'nf'=>'200',
'ng'=>'200',
'nğ'=>'200',
'nh'=>'200',
'nj'=>'200',
'nk'=>'200',
'nl'=>'200',
'nm'=>'200',
'nn'=>'200',
'np'=>'200',
'nr'=>'200',
'ns'=>'200',
'nş'=>'200',
'nt'=>'200',
'nv'=>'200',
'ny'=>'200',
'nz'=>'200',
'pb'=>'200',
'pc'=>'200',
'pç'=>'200',
'pd'=>'200',
'pf'=>'200',
'pg'=>'200',
'pğ'=>'200',
'ph'=>'200',
'pj'=>'200',
'pk'=>'200',
'pl'=>'200',
'pm'=>'200',
'pn'=>'200',
'pp'=>'200',
'pr'=>'200',
'ps'=>'200',
'pş'=>'200',
'pt'=>'200',
'pv'=>'200',
'py'=>'200',
'pz'=>'200',
'rb'=>'200',
'rc'=>'200',
'rç'=>'200',
'rd'=>'200',
'rf'=>'200',
'rg'=>'200',
'rğ'=>'200',
'rh'=>'200',
'rj'=>'200',
'rk'=>'200',
'rl'=>'200',
'rm'=>'200',
'rn'=>'200',
'rp'=>'200',
'rr'=>'200',
'rs'=>'200',
'rş'=>'200',
'rt'=>'200',
'rv'=>'200',
'ry'=>'200',
'rz'=>'200',
'sb'=>'200',
'sc'=>'200',
'sç'=>'200',
'sd'=>'200',
'sf'=>'200',
'sg'=>'200',
'sğ'=>'200',
'sh'=>'200',
'sj'=>'200',
'sk'=>'200',
'sl'=>'200',
'sm'=>'200',
'sn'=>'200',
'sp'=>'200',
'sr'=>'200',
'ss'=>'200',
'sş'=>'200',
'st'=>'200',
'sv'=>'200',
'sy'=>'200',
'sz'=>'200',
'şb'=>'200',
'şc'=>'200',
'şç'=>'200',
'şd'=>'200',
'şf'=>'200',
'şg'=>'200',
'şğ'=>'200',
'şh'=>'200',
'şj'=>'200',
'şk'=>'200',
'şl'=>'200',
'şm'=>'200',
'şn'=>'200',
'şp'=>'200',
'şr'=>'200',
'şs'=>'200',
'şş'=>'200',
'şt'=>'200',
'şv'=>'200',
'şy'=>'200',
'şz'=>'200',
'tb'=>'200',
'tc'=>'200',
'tç'=>'200',
'td'=>'200',
'tf'=>'200',
'tg'=>'200',
'tğ'=>'200',
'th'=>'200',
'tj'=>'200',
'tk'=>'200',
'tl'=>'200',
'tm'=>'200',
'tn'=>'200',
'tp'=>'200',
'tr'=>'200',
'ts'=>'200',
'tş'=>'200',
'tt'=>'200',
'tv'=>'200',
'ty'=>'200',
'tz'=>'200',
'vb'=>'200',
'vc'=>'200',
'vç'=>'200',
'vd'=>'200',
'vf'=>'200',
'vg'=>'200',
'vğ'=>'200',
'vh'=>'200',
'vj'=>'200',
'vk'=>'200',
'vl'=>'200',
'vm'=>'200',
'vn'=>'200',
'vp'=>'200',
'vr'=>'200',
'vs'=>'200',
'vş'=>'200',
'vt'=>'200',
'vv'=>'200',
'vy'=>'200',
'vz'=>'200',
'yb'=>'200',
'yc'=>'200',
'yç'=>'200',
'yd'=>'200',
'yf'=>'200',
'yg'=>'200',
'yğ'=>'200',
'yh'=>'200',
'yj'=>'200',
'yk'=>'200',
'yl'=>'200',
'ym'=>'200',
'yn'=>'200',
'yp'=>'200',
'yr'=>'200',
'ys'=>'200',
'yş'=>'200',
'yt'=>'200',
'yv'=>'200',
'yy'=>'200',
'yz'=>'200',
'zb'=>'200',
'zc'=>'200',
'zç'=>'200',
'zd'=>'200',
'zf'=>'200',
'zg'=>'200',
'zğ'=>'200',
'zh'=>'200',
'zj'=>'200',
'zk'=>'200',
'zl'=>'200',
'zm'=>'200',
'zn'=>'200',
'zp'=>'200',
'zr'=>'200',
'zs'=>'200',
'zş'=>'200',
'zt'=>'200',
'zv'=>'200',
'zy'=>'200',
'zz'=>'200',
'aa'=>'032',
'aâ'=>'032',
'ae'=>'032',
'aı'=>'032',
'ai'=>'032',
'aî'=>'032',
'ao'=>'032',
'aö'=>'032',
'au'=>'032',
'aü'=>'032',
'aû'=>'032',
'âa'=>'032',
'ââ'=>'032',
'âe'=>'032',
'âı'=>'032',
'âi'=>'032',
'âî'=>'032',
'âo'=>'032',
'âö'=>'032',
'âu'=>'032',
'âü'=>'032',
'âû'=>'032',
'ea'=>'032',
'eâ'=>'032',
'ee'=>'032',
'eı'=>'032',
'ei'=>'032',
'eî'=>'032',
'eo'=>'032',
'eö'=>'032',
'eu'=>'032',
'eü'=>'032',
'eû'=>'032',
'ıa'=>'032',
'ıâ'=>'032',
'ıe'=>'032',
'ıı'=>'032',
'ıi'=>'032',
'ıî'=>'032',
'ıo'=>'032',
'ıö'=>'032',
'ıu'=>'032',
'ıü'=>'032',
'ıû'=>'032',
'ia'=>'032',
'iâ'=>'032',
'ie'=>'032',
'iı'=>'032',
'ii'=>'032',
'iî'=>'032',
'io'=>'032',
'iö'=>'032',
'iu'=>'032',
'iü'=>'032',
'iû'=>'032',
'îa'=>'032',
'îâ'=>'032',
'îe'=>'032',
'îı'=>'032',
'îi'=>'032',
'îî'=>'032',
'îo'=>'032',
'îö'=>'032',
'îu'=>'032',
'îü'=>'032',
'îû'=>'032',
'oa'=>'032',
'oâ'=>'032',
'oe'=>'032',
'oı'=>'032',
'oi'=>'032',
'oî'=>'032',
'oo'=>'032',
'oö'=>'032',
'ou'=>'032',
'oü'=>'032',
'oû'=>'032',
'öa'=>'032',
'öâ'=>'032',
'öe'=>'032',
'öı'=>'032',
'öi'=>'032',
'öî'=>'032',
'öo'=>'032',
'öö'=>'032',
'öu'=>'032',
'öü'=>'032',
'öû'=>'032',
'ua'=>'032',
'uâ'=>'032',
'ue'=>'032',
'uı'=>'032',
'ui'=>'032',
'uî'=>'032',
'uo'=>'032',
'uö'=>'032',
'uu'=>'032',
'uü'=>'032',
'uû'=>'032',
'üa'=>'032',
'üâ'=>'032',
'üe'=>'032',
'üı'=>'032',
'üi'=>'032',
'üî'=>'032',
'üo'=>'032',
'üö'=>'032',
'üu'=>'032',
'üü'=>'032',
'üû'=>'032',
'ûa'=>'032',
'ûâ'=>'032',
'ûe'=>'032',
'ûı'=>'032',
'ûi'=>'032',
'ûî'=>'032',
'ûo'=>'032',
'ûö'=>'032',
'ûu'=>'032',
'ûü'=>'032',
'ûû'=>'032',
'turk'=>'00440',
'mtrak'=>'014000'
)
);
?>
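
The header of this file says the TeX patterns were rewritten as PHP arrays. Judging from entries such as `'turk'=>'00440'` and `'ecek'=>'22000'`, the convention appears to be: the key is the pattern with its digits removed, and the value holds one digit per gap between letters, with `0` where the TeX pattern had none. The sketch below reproduces that mapping. The function name `tex_pattern_to_entry` is hypothetical, the TeX spellings `tu4r4k`, `m1t4rak` and `2e2cek.` are inferred from the array entries rather than quoted from `hyph-tr.tex`, and leading-dot patterns are rejected because no `begin` entry is available here to check their layout against.

```php
<?php
// Hypothetical helper for illustration; it is not part of PHP Typography.
// Converts one TeX hyphenation pattern into array(section, key, digits).
function tex_pattern_to_entry($pattern)
{
    if ($pattern !== '' && $pattern[0] === '.') {
        throw new InvalidArgumentException('leading-dot patterns are not handled in this sketch');
    }
    $section = 'all';
    if (substr($pattern, -1) === '.') {
        // A trailing dot anchors the pattern to the end of the word.
        $section = 'end';
        $pattern = substr($pattern, 0, -1);
    }
    // Split into code points so that letters such as "ç" stay intact.
    $chars = preg_split('//u', $pattern, -1, PREG_SPLIT_NO_EMPTY);
    $key = '';
    $digits = '';
    $pending = '0'; // digit for the gap before the next letter
    foreach ($chars as $ch) {
        if (strlen($ch) === 1 && strpos('0123456789', $ch) !== false) {
            $pending = $ch;
        } else {
            $digits .= $pending;
            $key .= $ch;
            $pending = '0';
        }
    }
    $digits .= $pending; // gap after the last letter
    return array($section, $key, $digits);
}

print_r(tex_pattern_to_entry('tu4r4k'));  // section all, key turk, digits 00440
print_r(tex_pattern_to_entry('2e2cek.')); // section end, key ecek, digits 22000
```

Running the helper over an array key and value in reverse is a quick way to read an entry: `'mtrak'=>'014000'` corresponds to `m1t4rak` under the same assumption.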

2092
php-typography/lang/uk.php Normal file

File diff suppressed because it is too large

@ -0,0 +1,309 @@
<?php
/*
Project: PHP Typography
Project URI: http://kingdesk.com/projects/php-typography/
File modified to place pattern and exceptions in arrays that can be understood in php files.
This file is released under the same copyright as the below referenced original file
Original unmodified file is available at: http://mirror.unl.edu/ctan/language/hyph-utf8/tex/generic/hyph-utf8/patterns/
Original file name: hyph-zh-latn.tex
//============================================================================================================
ORIGINAL FILE INFO
% This file is part of hyph-utf8 package and resulted from
% semi-manual conversions of hyphenation patterns into UTF-8 in June 2008.
%
% Source: pyhyph.tex (yyyy-mm-dd)
% Author: Werner Lemberg <wl at gnu.org>
%
% The above mentioned file should become obsolete,
% and the author of the original file should preferably modify this file instead.
%
% Modifications were needed in order to support native UTF-8 engines,
% but functionality (hopefully) didn't change in any way, at least not intentionally.
% This file is no longer stand-alone; at least for 8-bit engines
% you probably want to use loadhyph-foo.tex (which will load this file) instead.
%
% Modifications were done by Jonathan Kew, Mojca Miklavec & Arthur Reutenauer
% with help & support from:
% - Karl Berry, who gave us free hands and all resources
% - Taco Hoekwater, with useful macros
% - Hans Hagen, who did the unicodifisation of patterns already long before
% and helped with testing, suggestions and bug reports
% - Norbert Preining, who tested & integrated patterns into TeX Live
%
% However, the 'copyright/copyleft' owner of patterns remains the original author.
%
% The copyright statement of this file is thus:
%
% Do with this file whatever needs to be done in future for the sake of
% 'a better world' as long as you respect the copyright of original file.
% If you're the original author of patterns or taking over a new revolution,
% please remove all of the TUG comments & credits that we added here -
% you are the Queen / the King, we are only the servants.
%
% If you want to change this file, rather than uploading directly to CTAN,
% we would be grateful if you could send it to us (http://tug.org/tex-hyphen)
% or ask for credentials for SVN repository and commit it yourself;
% we will then upload the whole 'package' to CTAN.
%
% Before a new 'pattern-revolution' starts,
% please try to follow some guidelines if possible:
%
% - \lccode is *forbidden*, and I really mean it
% - all the patterns should be in UTF-8
% - the only 'allowed' TeX commands in this file are: \patterns, \hyphenation,
% and if you really cannot do without, also \input and \message
% - in particular, please no \catcode or \lccode changes,
% they belong to loadhyph-foo.tex,
% and no \lefthyphenmin and \righthyphenmin,
% they have no influence here and belong elsewhere
% - \begingroup and/or \endinput is not needed
% - feel free to do whatever you want inside comments
%
% We know that TeX is extremely powerful, but give a stupid parser
% at least a chance to read your patterns.
%
% For more information see
%
% http://tug.org/tex-hyphen
%
%------------------------------------------------------------------------------
%
% This is the file pyhyph.tex of the CJK package
% for hyphenating Chinese pinyin syllables.
%
% created by Werner Lemberg <wl@gnu.org>
%
% Version 4.8.0 (22-May-2008)
%
% Copyright (C) 1994-2008 Werner Lemberg <wl@gnu.org>
%
% This program is free software; you can redistribute it and/or modify
% it under the terms of the GNU General Public License as published by
% the Free Software Foundation; either version 2 of the License, or
% (at your option) any later version.
%
% This program is distributed in the hope that it will be useful,
% but WITHOUT ANY WARRANTY; without even the implied warranty of
% MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
% GNU General Public License for more details.
%
% You should have received a copy of the GNU General Public License
% along with this program in doc/COPYING; if not, write to the Free
% Software Foundation, Inc., 51 Franklin St, Fifth Floor, Boston,
% MA 02110-1301 USA
%
% \message{Hyphenation patterns for unaccented pinyin syllables (CJK 4.7.0)}
//============================================================================================================
*/
$patgenLanguage = 'Chinese pinyin (Latin)';
$patgenExceptions = array();
$patgenMaxSeg = 2;
$patgen = array(
'begin'=>array(),
'end'=>array(),
'all'=>array(
'ab'=>'010',
'ac'=>'010',
'ad'=>'010',
'af'=>'010',
'ag'=>'010',
'ah'=>'010',
'aj'=>'010',
'ak'=>'010',
'al'=>'010',
'am'=>'010',
'ap'=>'010',
'aq'=>'010',
'ar'=>'010',
'as'=>'010',
'at'=>'010',
'aw'=>'010',
'ax'=>'010',
'ay'=>'010',
'az'=>'010',
'eb'=>'010',
'ec'=>'010',
'ed'=>'010',
'ef'=>'010',
'eg'=>'010',
'eh'=>'010',
'ej'=>'010',
'ek'=>'010',
'el'=>'010',
'em'=>'010',
'ep'=>'010',
'eq'=>'010',
'es'=>'010',
'et'=>'010',
'ew'=>'010',
'ex'=>'010',
'ey'=>'010',
'ez'=>'010',
'ga'=>'100',
'gb'=>'010',
'gc'=>'010',
'gd'=>'010',
'ge'=>'100',
'gf'=>'010',
'gg'=>'010',
'gh'=>'010',
'gj'=>'010',
'gk'=>'010',
'gl'=>'010',
'gm'=>'010',
'gn'=>'010',
'go'=>'100',
'gp'=>'010',
'gq'=>'010',
'gr'=>'010',
'gs'=>'010',
'gt'=>'010',
'gu'=>'100',
'gw'=>'010',
'gx'=>'010',
'gy'=>'010',
'gz'=>'010',
'ib'=>'010',
'ic'=>'010',
'id'=>'010',
'if'=>'010',
'ig'=>'010',
'ih'=>'010',
'ij'=>'010',
'ik'=>'010',
'il'=>'010',
'im'=>'010',
'ip'=>'010',
'iq'=>'010',
'ir'=>'010',
'is'=>'010',
'it'=>'010',
'iw'=>'010',
'ix'=>'010',
'iy'=>'010',
'iz'=>'010',
'na'=>'100',
'nb'=>'010',
'nc'=>'010',
'nd'=>'010',
'ne'=>'100',
'nf'=>'010',
'nh'=>'010',
'ni'=>'100',
'nj'=>'010',
'nk'=>'010',
'nl'=>'010',
'nm'=>'010',
'nn'=>'010',
'no'=>'100',
'np'=>'010',
'nq'=>'010',
'nr'=>'010',
'ns'=>'010',
'nt'=>'010',
'nu'=>'100',
'nü'=>'100',
'nw'=>'010',
'nx'=>'010',
'ny'=>'010',
'nz'=>'010',
'ob'=>'010',
'oc'=>'010',
'od'=>'010',
'of'=>'010',
'og'=>'010',
'oh'=>'010',
'oj'=>'010',
'ok'=>'010',
'ol'=>'010',
'om'=>'010',
'op'=>'010',
'oq'=>'010',
'or'=>'010',
'os'=>'010',
'ot'=>'010',
'ow'=>'010',
'ox'=>'010',
'oy'=>'010',
'oz'=>'010',
'ra'=>'100',
'rb'=>'010',
'rc'=>'010',
'rd'=>'010',
're'=>'100',
'rf'=>'010',
'rg'=>'010',
'rh'=>'010',
'ri'=>'100',
'rj'=>'010',
'rk'=>'010',
'rl'=>'010',
'rm'=>'010',
'rn'=>'010',
'ro'=>'100',
'rp'=>'010',
'rq'=>'010',
'rr'=>'010',
'rs'=>'010',
'rt'=>'010',
'ru'=>'100',
'rw'=>'010',
'rx'=>'010',
'ry'=>'010',
'rz'=>'010',
'ub'=>'010',
'uc'=>'010',
'ud'=>'010',
'uf'=>'010',
'ug'=>'010',
'uh'=>'010',
'uj'=>'010',
'uk'=>'010',
'ul'=>'010',
'um'=>'010',
'up'=>'010',
'uq'=>'010',
'ur'=>'010',
'us'=>'010',
'ut'=>'010',
'uw'=>'010',
'ux'=>'010',
'uy'=>'010',
'uz'=>'010',
'üb'=>'010',
'üc'=>'010',
'üd'=>'010',
'üf'=>'010',
'üg'=>'010',
'üh'=>'010',
'üj'=>'010',
'ük'=>'010',
'ül'=>'010',
'üm'=>'010',
'ün'=>'010',
'üp'=>'010',
'üq'=>'010',
'ür'=>'010',
'üs'=>'010',
'üt'=>'010',
'üw'=>'010',
'üx'=>'010',
'üy'=>'010',
'üz'=>'010',
"'a"=>'010',
"'e"=>'010',
"'o"=>'010'
)
);
?>
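
The arrays above are pattern data in the style of Liang's hyphenation algorithm: every pattern that occurs in a word contributes its digits to the gaps it covers, the largest digit per gap wins, and an odd result marks a permitted break. The sketch below applies that rule to a small subset of the pinyin patterns. The function name `hyphenation_points` is hypothetical and is not how PHP Typography exposes this, only the `all` section is consulted (the `begin` and `end` arrays of this file are empty), and the input is assumed to be lowercase already.

```php
<?php
// Hypothetical helper for illustration; it is not part of PHP Typography.
// Returns the code-point offsets at which a break is permitted.
function hyphenation_points($word, array $patgen, $maxSeg)
{
    $chars = preg_split('//u', $word, -1, PREG_SPLIT_NO_EMPTY);
    $n = count($chars);
    // $scores[$i] is the value of the gap just before $chars[$i].
    $scores = array_fill(0, $n + 1, 0);

    for ($start = 0; $start < $n; $start++) {
        for ($len = 1; $len <= $maxSeg && $start + $len <= $n; $len++) {
            $segment = implode('', array_slice($chars, $start, $len));
            if (!isset($patgen['all'][$segment])) {
                continue;
            }
            $digits = $patgen['all'][$segment];
            // A key of $len letters carries $len + 1 digits.
            for ($k = 0; $k <= $len; $k++) {
                $scores[$start + $k] = max($scores[$start + $k], (int) $digits[$k]);
            }
        }
    }

    $breaks = array();
    for ($i = 1; $i < $n; $i++) { // never break before the first letter
        if ($scores[$i] % 2 === 1) {
            $breaks[] = $i;
        }
    }
    return $breaks;
}

// Subset of the entries listed above; $maxSeg mirrors $patgenMaxSeg = 2.
$subset = array(
    'begin' => array(),
    'end'   => array(),
    'all'   => array('ij' => '010', 'gg' => '010', 'gu' => '100'),
);
print_r(hyphenation_points('beijing', $subset, 2));  // break offset 3: bei-jing
print_r(hyphenation_points('zhongguo', $subset, 2)); // break offset 5: zhong-guo
```

Minimum fragment lengths on either side of a break, which a real hyphenator would normally enforce, are left out to keep the sketch short.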

@ -0,0 +1,252 @@
% This file is part of hyph-utf8 package and resulted from
% semi-manual conversions of hyphenation patterns into UTF-8 in June 2008.
%
% Source: lahyph.tex (2007-09-03)
% Author: Claudio Beccari <claudio.beccari at polito.it>
%
% The above mentioned file should become obsolete,
% and the author of the original file should preferably modify this file instead.
%
% Modifications were needed in order to support native UTF-8 engines,
% but functionality (hopefully) didn't change in any way, at least not intentionally.
% This file is no longer stand-alone; at least for 8-bit engines
% you probably want to use loadhyph-foo.tex (which will load this file) instead.
%
% Modifications were done by Jonathan Kew, Mojca Miklavec & Arthur Reutenauer
% with help & support from:
% - Karl Berry, who gave us free hands and all resources
% - Taco Hoekwater, with useful macros
% - Hans Hagen, who did the unicodifisation of patterns already long before
% and helped with testing, suggestions and bug reports
% - Norbert Preining, who tested & integrated patterns into TeX Live
%
% However, the "copyright/copyleft" owner of patterns remains the original author.
%
% The copyright statement of this file is thus:
%
% Do with this file whatever needs to be done in future for the sake of
% "a better world" as long as you respect the copyright of original file.
% If you're the original author of patterns or taking over a new revolution,
% please remove all of the TUG comments & credits that we added here -
% you are the Queen / the King, we are only the servants.
%
% If you want to change this file, rather than uploading directly to CTAN,
% we would be grateful if you could send it to us (http://tug.org/tex-hyphen)
% or ask for credentials for SVN repository and commit it yourself;
% we will then upload the whole "package" to CTAN.
%
% Before a new "pattern-revolution" starts,
% please try to follow some guidelines if possible:
%
% - \lccode is *forbidden*, and I really mean it
% - all the patterns should be in UTF-8
% - the only "allowed" TeX commands in this file are: \patterns, \hyphenation,
% and if you really cannot do without, also \input and \message
% - in particular, please no \catcode or \lccode changes,
% they belong to loadhyph-foo.tex,
% and no \lefthyphenmin and \righthyphenmin,
% they have no influence here and belong elsewhere
% - \begingroup and/or \endinput is not needed
% - feel free to do whatever you want inside comments
%
% We know that TeX is extremely powerful, but give a stupid parser
% at least a chance to read your patterns.
%
% For more information see
%
% http://tug.org/tex-hyphen
%
%------------------------------------------------------------------------------
%
% ********** lahyph.tex *************
%
% Copyright 1999- 2001 Claudio Beccari
% [latin hyphenation patterns]
%
% -----------------------------------------------------------------
% IMPORTANT NOTICE:
%
% This program can be redistributed and/or modified under the terms
% of the LaTeX Project Public License Distributed from CTAN
% archives in directory macros/latex/base/lppl.txt; either
% version 1 of the License, or any later version.
% -----------------------------------------------------------------
%
% Patterns for the latin language mainly in modern spelling
% (u when u is needed and v when v is needed); medieval spelling
% with the ligatures \ae and \oe and the (uncial) lowercase `v'
% written as a `u' is also supported; apparently there is no conflict
% between the patterns of modern Latin and those of medieval Latin.
%
% Support for font encoding OT1 with 128-character set and
% for font encoding T1 with a 256-character set.
%
% Prepared by Claudio Beccari
% Politecnico di Torino
% Torino, Italy
% e-mail beccari@polito.it
%
% 1999/03/10 Integration of `lahyph7.tex' and `lahyph8.tex' into
% one file `lahyph.tex' supporting fonts in OT1 and T1 encoding by
% Bernd Raichle using the macro code from `dehypht.tex' (this code
% is Copyright 1993,1994,1998,1999 Bernd Raichle/DANTE e.V.).
%
%
% \versionnumber{3.1} \versiondate{2007/04/16}
%
% Information after \endinput.
%
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%
% \message{Latin Hyphenation Patterns `lahyph' Version 3.1 <2007/04/16>}
%
%
\patterns{%
2'2
.a2b3l
.anti1 .anti3m2n
.circu2m1
.co2n1iun
.di2s3cine
.e2x1
.o2b3
.para1i .para1u
.su2b3lu .su2b3r
2s3que. 2s3dem.
3p2sic
3p2neu
æ1 œ1
a1ia a1ie a1io a1iu ae1a ae1o ae1u
e1iu
io1i
o1ia o1ie o1io o1iu
uo3u
1b 2bb 2bd b2l 2bm 2bn b2r 2bt 2bs 2b.
1c 2cc c2h2 c2l 2cm 2cn 2cq c2r 2cs 2ct 2cz 2c.
1d 2dd 2dg 2dm d2r 2ds 2dv 2d.
1f 2ff f2l 2fn f2r 2ft 2f.
1g 2gg 2gd 2gf g2l 2gm g2n g2r 2gs 2gv 2g.
1h 2hp 2ht 2h.
1j
1k 2kk k2h2
1l 2lb 2lc 2ld 2lf l3f2t 2lg 2lk 2ll 2lm 2ln 2lp 2lq 2lr
2ls 2lt 2lv 2l.
1m 2mm 2mb 2mp 2ml 2mn 2mq 2mr 2mv 2m.
1n 2nb 2nc 2nd 2nf 2ng 2nl 2nm 2nn 2np 2nq 2nr 2ns
n2s3m n2s3f 2nt 2nv 2nx 2n.
1p p2h p2l 2pn 2pp p2r 2ps 2pt 2pz 2php 2pht 2p.
1qu2
1r 2rb 2rc 2rd 2rf 2rg r2h 2rl 2rm 2rn 2rp 2rq 2rr 2rs 2rt
2rv 2rz 2r.
1s2 2s3ph 2s3s 2stb 2stc 2std 2stf 2stg 2st3l 2stm 2stn 2stp 2stq
2sts 2stt 2stv 2s. 2st.
1t 2tb 2tc 2td 2tf 2tg t2h t2l t2r 2tm 2tn 2tp 2tq 2tt
2tv 2t.
1v v2l v2r 2vv
1x 2xt 2xx 2x.
1z 2z.
% For medieval Latin
a1ua a1ue a1ui a1uo a1uu
e1ua e1ue e1ui e1uo e1uu
i1ua i1ue i1ui i1uo i1uu
o1ua o1ue o1ui o1uo o1uu
u1ua u1ue u1ui u1uo u1uu
%
a2l1ua a2l1ue a2l1ui a2l1uo a2l1uu
e2l1ua e2l1ue e2l1ui e2l1uo e2l1uu
i2l1ua i2l1ue i2l1ui i2l1uo i2l1uu
o2l1ua o2l1ue o2l1ui o2l1uo o2l1uu
u2l1ua u2l1ue u2l1ui u2l1uo u2l1uu
%
a2m1ua a2m1ue a2m1ui a2m1uo a2m1uu
e2m1ua e2m1ue e2m1ui e2m1uo e2m1uu
i2m1ua i2m1ue i2m1ui i2m1uo i2m1uu
o2m1ua o2m1ue o2m1ui o2m1uo o2m1uu
u2m1ua u2m1ue u2m1ui u2m1uo u2m1uu
%
a2n1ua a2n1ue a2n1ui a2n1uo a2n1uu
e2n1ua e2n1ue e2n1ui e2n1uo e2n1uu
i2n1ua i2n1ue i2n1ui i2n1uo i2n1uu
o2n1ua o2n1ue o2n1ui o2n1uo o2n1uu
u2n1ua u2n1ue u2n1ui u2n1uo u2n1uu
%
a2r1ua a2r1ue a2r1ui a2r1uo a2r1uu
e2r1ua e2r1ue e2r1ui e2r1uo e2r1uu
i2r1ua i2r1ue i2r1ui i2r1uo i2r1uu
o2r1ua o2r1ue o2r1ui o2r1uo o2r1uu
u2r1ua u2r1ue u2r1ui u2r1uo u2r1uu
%
%
}
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%
% For documentation see:
% C. Beccari, "Computer aided hyphenation for Italian and Modern
% Latin", TUG vol. 13, n. 1, pp. 23-33 (1992)
%
% see also
%
% C. Beccari, "Typesetting of ancient languages",
% TUG vol.15, n.1, pp. 9-16 (1994)
%
% In the former paper the code was described as being contained in file
% ITALAT.TEX; this is substantially the same code, but the file has been
% renamed LAHYPH.TEX in accordance with the ISO name for Latin and the
% convention that all hyphenation pattern file names should be formed by the
% agglutination of two letter language ISO code and the abbreviation HYPH.
%
% A corresponding file (ITHYPH.TEX) has been extracted in order to eliminate
% the (few) patterns specific to Latin and leave those specific to Italian;
% ITHYPH.TEX has been further extended with many new patterns in order to
% cope with the many neologisms and technical terms with foreign roots.
%
% Should you find any word that gets hyphenated in a wrong way, please, AFTER
% CHECKING ON A RELIABLE MODERN DICTIONARY, report to the author, preferably
% by e-mail. Please do not report about wrong break points concerning
% prefixes and/or suffixes; see at the bottom of this file.
%
% Compared with the previous versions, this file has been extended so as to
% cope also with the medieval Latin spelling, where the letter `V' played the
% roles of both `U' and `V', as in the Roman times, save that the Romans used
% only capitals. In the middle ages, owing to the availability of soft writing
% supports and the necessity of copying books with a reasonable speed, several
% scripts evolved; in (practically) all of them there was a lower case alphabet
% different from the upper case one, the lower case `v' had the rounded shape
% of our modern lower case `u', and the Latin diphthongs `AE' and `OE', both
% in upper and lower case, were written as ligatures, not to mention the habit
% of substituting them with their sound, that is a simple `E'.
%
% According to Leon Battista Alberti, who in 1466 wrote a book on
% cryptography where he thoroughly analyzed the hyphenation of the Latin
% language of his (still medieval) times, the differences from the Tuscan
% language (the Italian language, as it was named at his time) were very
% limited, in particular for what concerns the handling of the ascending and
% descending diphthongs; in Central and Northern Europe, and later on in
% North America, scholars perceived the above diphthongs as made of two
% distinct vowels; the hyphenation of medieval Latin, therefore, was quite
% different in the northern countries compared to the southern ones, at least
% for what concerns these diphthongs. If you need hyphenation patterns for
% medieval Latin that suit you better according to the habits of Northern
% Europe you should resort to the hyphenation patterns prepared by Yannis
% Haralambous (TUGboat, vol.13 n.4 (1992)).
%
%
%
% PREFIXES AND SUFFIXES
%
% For what concerns prefixes and suffixes, the latter are generally separated
% according to "natural" syllabification, while the former are generally
% divided etymologically. In order to avoid an excessive number of patterns,
% special care has been taken only with some prefixes, especially "ex", "trans", "circum",
% "prae", but this set of patterns is NOT capable of separating the prefixes
% in all circumstances.
%
% BABEL SHORTCUTS AND FACILITIES
%
% Read the documentation that comes with the description of the Latin language
% interface of Babel to see the shortcuts and facilities introduced to ease
% the insertion of "compound word marks", which are very useful for inserting
% etymological break points.
%
% Happy Latin and multilingual typesetting!
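
As a rough illustration of how the numeric patterns above are used, the following PHP sketch applies Liang-style patterns to a single word: each pattern is matched against the word wrapped in periods, the largest digit wins at every inter-letter position, and odd values mark permitted break points. The function name `hyphenate_sketch` and the minimum of two letters on each side of a break are assumptions made for this example; this is not the code used by PHP Typography, and the example calls use only a few of the consonant patterns listed above.

```php
<?php
/**
 * Hypothetical helper: apply Liang-style patterns to one lower-case ASCII word.
 * $leftMin / $rightMin (assumed defaults) keep breaks away from the word edges.
 */
function hyphenate_sketch(string $word, array $patterns, int $leftMin = 2, int $rightMin = 2): string
{
    $dotted = '.' . $word . '.';
    // $points[$i] is the value of the position just before $dotted[$i].
    $points = array_fill(0, strlen($dotted) + 1, 0);

    foreach ($patterns as $pattern) {
        // Separate the letters from the digit that precedes each letter.
        $letters = '';
        $values  = array(0);
        foreach (str_split($pattern) as $ch) {
            if (ctype_digit($ch)) {
                $values[count($values) - 1] = (int) $ch;
            } else {
                $letters .= $ch;
                $values[] = 0;
            }
        }
        if ($letters === '') {
            continue;
        }
        // Apply the pattern at every place where its letters occur.
        $offset = 0;
        while (($pos = strpos($dotted, $letters, $offset)) !== false) {
            foreach ($values as $k => $v) {
                $points[$pos + $k] = max($points[$pos + $k], $v);
            }
            $offset = $pos + 1;
        }
    }

    $out = '';
    $n   = strlen($word);
    for ($i = 0; $i < $n; $i++) {
        // Because of the leading period, $points[$i + 1] sits before $word[$i].
        if ($i >= $leftMin && $n - $i >= $rightMin && $points[$i + 1] % 2 === 1) {
            $out .= '-';
        }
        $out .= $word[$i];
    }
    return $out;
}

echo hyphenate_sketch('annus', array('1n', '2nn')), "\n";
```

With only these two patterns, `1n` proposes a break before each `n` and `2nn` suppresses the one before the first, so the word is split between the doubled consonants.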

File diff suppressed because it is too large


@ -0,0 +1,63 @@
<?php
/*
Project: PHP Typography
Project URI: http://kingdesk.com/projects/php-typography/
File modified to place pattern and exceptions in arrays that can be understood in php files.
This file is released under the same copyright as the below referenced original file
Original unmodified file is available at: http://mirror.unl.edu/ctan/language/hyph-utf8/tex/generic/hyph-utf8/patterns/
Original file name: hyph-_______________.tex
//============================================================================================================
ORIGINAL FILE INFO
//============================================================================================================
*/
$patgenLanguage = ""; //Common name for language
$patgenExceptions = array(); // list of exceptions in the form of: array('associate'=>'as-so-ciate','associates'=>'as-so-ciates')
$patgenMaxSeg = 3; // maximum segment length in the patterns below (NOTE: segment length, not sequence length)
$patgen = array('begin'=>array(),'end'=>array(),'all'=>array()); // key/values should be formatted so that the key is the relevant word segment and the value is the related patgen sequence. For example: array('begin'=>array('ach'=>'0004'),'end'=>array('ab'=>'400'),'all'=>array('aba'=>'0501'))
// Reformatting original TeX/patgen patterns is less than convenient, but it greatly improves PHP performance.
// Some additional help in formatting these files for use in the PHP Typography project:
// The original TeX pattern files have lists of segments formatted like this:
// .ach4
// 4ab.
// a5ba1
// If a segment includes a period, it indicates that it should only be applied at the beginning or end of words (relative to its position)
// If a period appears before the original TeX pattern, it belongs in the 'begin' subarray of the $patgen variable
// If a period appears after the original TeX pattern, it belongs in the 'end' subarray of the $patgen variable
// If the original TeX pattern does not contain a period, it belongs in the 'all' subarray of the $patgen variable
//
// The word segment is derived from the original TeX pattern by stripping all numbers and periods. Thus:
// .ach4 becomes ach
// 4ab. becomes ab
// a5ba1 becomes aba
//
// The patgen sequence is derived from the original TeX pattern by
// 1) removing any period
// 2) placing a "0" between any adjacent letters and at the beginning or end of the TeX pattern (if there is not already a number)
// 3) stripping all letters
// Thus:
// .ach4 > ach4 > 0a0c0h4 > 0004
// 4ab. > 4ab > 4a0b0 > 400
// a5ba1 > 0a5b0a1 > 0501
//
// The final sequence should be one character longer in length than the related word segment
//
// Lastly, the file name must be the language's "Language Code" formatted according to the W3C format for language codes (see http://www.w3.org/TR/REC-html40/struct/dirlang.html#langcodes)
//
// To activate the new language definition, simply save the file to the /php-typography/lang/ directory
?>
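
The conversion rules described in the template comments can be sketched as a small helper. This is only an illustration of those rules, not part of PHP Typography: the function names `patgen_entry_sketch` and `patgen_build_sketch` are hypothetical, and a pattern with periods on both ends (a case the template does not describe) is simply filed under 'begin'.

```php
<?php
/**
 * Hypothetical helper: turn one TeX pattern into array(subarray, segment, sequence).
 */
function patgen_entry_sketch(string $texPattern): array
{
    // A leading period means 'begin', a trailing one 'end', none means 'all'.
    if ($texPattern !== '' && $texPattern[0] === '.') {
        $position = 'begin';
    } elseif (substr($texPattern, -1) === '.') {
        $position = 'end';
    } else {
        $position = 'all';
    }

    $bare  = str_replace('.', '', $texPattern);               // 1) remove periods
    $chars = preg_split('//u', $bare, -1, PREG_SPLIT_NO_EMPTY);

    $segment  = '';
    $sequence = '';
    $pending  = '0';                                          // 2) default "0" between letters
    foreach ($chars as $ch) {
        if (ctype_digit($ch)) {
            $pending = $ch;
        } else {
            $sequence .= $pending;                            // digit that precedes this letter
            $segment  .= $ch;                                 // 3) letters go to the segment only
            $pending   = '0';
        }
    }
    $sequence .= $pending;                                    // digit after the last letter

    return array($position, $segment, $sequence);
}

/**
 * Hypothetical helper: build the $patgen array and the maximum segment length.
 */
function patgen_build_sketch(array $texPatterns): array
{
    $patgen = array('begin' => array(), 'end' => array(), 'all' => array());
    $maxSeg = 0;
    foreach ($texPatterns as $p) {
        list($position, $segment, $sequence) = patgen_entry_sketch($p);
        $patgen[$position][$segment] = $sequence;
        // The sequence is one character longer than the segment.
        $maxSeg = max($maxSeg, strlen($sequence) - 1);
    }
    return array($patgen, $maxSeg);
}

list($patgen, $patgenMaxSeg) = patgen_build_sketch(array('.ach4', '4ab.', 'a5ba1'));
print_r($patgen);
```

Run on the three patterns used in the comments, the sketch reproduces the example values given for `$patgen` and `$patgenMaxSeg` in the template.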


@ -0,0 +1,37 @@
1.20 - December 20, 2009
Added HTML5 elements to parsing algorithm for greater contextual awareness
1.19 - December 1, 2009
Corrected some uninitialized variables
1.12 - August 17, 2009
Corrected multibyte handling of nextChr and prevChr
1.10 - August 14, 2009
Increased set of recognized multibyte word characters
Corrected multibyte handling of nextChr and prevChr
1.4 - July 23, 2009
Added letter connectors (like soft-hyphens) as prohibited characters for get_words if it is set to strictly return letter only words.
1.3 - July 23, 2009
Uninitialized variables corrected throughout.
1.0 - July 15, 2009
Removed beta tag
1.0 beta 7 - July 10, 2009
added "/" as a valid word character so we could capture "this/that" as a word for processing (similar to "mother-in-law")
Corrected error where characters from the Latin 1 Supplement Block were not recognized as word characters
1.0 beta 1
initial release

File diff suppressed because it is too large


@ -0,0 +1,511 @@
<?php
/*
Project Name: PHP Parser
URI: http://kingdesk.com/projects/php-parser/
Author: Jeffrey D. King
Author URI: http://kingdesk.com/about/jeff/
Copyright 2009, KINGdesk, LLC. Licensed under the GNU General Public License 2.0. If you use, modify and/or redistribute this software, you must leave the KINGdesk, LLC copyright information, the request for a link to http://kingdesk.com, and the web design services contact information unchanged. If you redistribute this software, or any derivative, it must be released under the GNU General Public License 2.0. This program is distributed without warranty (implied or otherwise) of suitability for any particular purpose. See the GNU General Public License for full license terms <http://creativecommons.org/licenses/GPL/2.0/>.
WE DON'T WANT YOUR MONEY: NO TIPS NECESSARY! If you enjoy this plugin, a link to http://kingdesk.com from your website would be appreciated.
For web design services, please contact info@kingdesk.com.
*/
#########################################################################################################
#########################################################################################################
##
## parseText assumes no HTML markup in text (except for special html characters like &gt;)
##
## if multibyte characters are passed, encoding must be UTF-8
##
#########################################################################################################
#########################################################################################################
class parseText {
var $mb = FALSE; //changes to this must occur prior to load
var $parsedHTML;
var $text = array();
/*
$text structure:
ARRAY:
index => ARRAY: tokenized Text
// REQUIRED
"type" => STRING: "space" | "punctuation" | "word" | "other"
"value" => STRING: token content
"parents" => ARRAY: parent tags: "index" => array("tagName" => tagName, "attributes" => array(name => value, ... ))
// elements must be assigned this value if it has a parent HTML element
*/
#=======================================================================
#=======================================================================
#== METHODS
#=======================================================================
#=======================================================================
########################################################################
# ( UN | RE )LOAD, UPDATE AND CLEAR METHODS
#
# Params: $rawText STRING containing HTML markup OR ARRAY containing a single parseHTML token
# Action: Tokenizes $rawText (or $rawText["value"] - as the case may be) and saves it to $this->text
# Returns: TRUE on completion
function load($rawText) {
$this->clear();
if(is_string($rawText)) {
// not passed a token of class parseHTML so we will fake it
$this->parsedHTML = "";
} elseif(is_array($rawText)) {
// passed an instance of a parseHTML token
$this->parsedHTML = $rawText;
$rawText = $rawText["value"];
} else {
// we have an error
return FALSE;
}
$encodings = array("ASCII","UTF-8", "ISO-8859-1");
$encoding = mb_detect_encoding($rawText."a", $encodings);
if("UTF-8" == $encoding) {
$this->mb = TRUE;
if(!function_exists('mb_strlen')) return FALSE;
} elseif("ASCII" != $encoding) {
return FALSE;
}
$utf8 = ($this->mb) ? "u" : "";
$tokens = array();
# find spacing FIRST (as it is the primary delimiter)
# find the HTML character representation for the following characters:
# tab | line feed | carriage return | space | non-breaking space | ethiopic wordspace
# ogham space mark | en quad space | em quad space | en-space | three-per-em space
# four-per-em space | six-per-em space | figure space | punctuation space | em-space
# thin space | hair space | narrow no-break space
# medium mathematical space | ideographic space
# Some characters are used inside words, we will not count these as a space for the purpose
# of finding word boundaries:
# zero-width-space ("&#8203;", "&#x200b;")
# zero-width-non-joiner ("&#8204;", "&#x200c;", "&zwnj;")
# zero-width-joiner ("&#8205;", "&#x200d;", "&zwj;")
$htmlSpaces = '
(?:
(?: # alpha matches
&
(?: nbsp|ensp|emsp|thinsp )
;
)
|
(?: # decimal matches
&\#
(?: 09|1[03]|32|160|4961|5760|819[2-9]|820[0-2]|8239|8287|12288 )
;
)
|
(?: # hexadecimal matches
&\#x
(?: 000[9ad]|0020|00a0|1361|1680|200[0-9a]|202f|205f|3000 )
;
)
|
(?: # actual characters
\x{0009}|\x{000a}|\x{000d}|\x{0020}|\x{00a0}|\x{1361}|\x{2000}|\x{2001}|\x{2002}|\x{2003}|
\x{2004}|\x{2005}|\x{2006}|\x{2007}|\x{2008}|\x{2009}|\x{200a}|\x{202f}|\x{205f}|\x{3000}
)
)
'; // required modifiers: x (multiline pattern) i (case insensitive) u (utf8)
$space = "(?:\s|$htmlSpaces)+"; // required modifiers: x (multiline pattern) i (case insensitive) $utf8
# find punctuation and symbols before words (to capture preceding delimiting characters like hyphens or underscores)
# see http://www.unicode.org/charts/PDF/U2000.pdf
# see http://www.unicode.org/charts/PDF/U2E00.pdf
# find punctuation and symbols
# dec matches = 33-44|46-47|58-60|62-64|91-94|96|123-126|161-172|174-191|215|247|710|732|977-978|982|8211-8231|8240-8286|8289-8292|8352-8399|8448-8527|8592-9215|9632-9983|11776-11903
# hex matches = 0021-002c|002e-002f|003a-003c|003e-0040|005b-005e|0060|007b-007e|00a1-00ac|00ae-00bf|00d7|00f7|02c6|02dc|03d1-03d2|
# 03d6|2013-2027|2030-205e|2061-2064|20a0-20cf|2100-214f|2190-23ff|25a0-26ff|2e00-2e7f
#
# Some characters are used inside words; we will not count these as punctuation for the purpose
# of finding word boundaries:
# hyphens ("&#45;", "&#173;", "&#8208;", "&#8209;", "&#8210;", "&#x002d;", "&#x00ad;", "&#x2010;", "&#x2011;", "&#x2012;", "&shy;")
# underscore ("&#95;", "&#x005f;")
$htmlPunctuation = '
(?:
(?: # alpha matches
&
(?:quot|amp|frasl|lt|gt|iexcl|cent|pound|curren|yen|brvbar|sect|uml|pound|ordf|laquo|not|reg|macr|deg|plusmn|sup2|sup3|acute|micro|para|middot|cedil|sup1|ordm|raquo|frac14|frac12|frac34|iquest|times|divide|circ|tilde|thetasym|upsih|piv|ndash|mdash|lsquo|rsquo|sbquo|ldquo|rdquo|bdquo|dagger|Dagger|bull|hellip|permil|prime|Prime|lsaquo|rsaquo|oline|frasl|euro|trade|alefsym|larr|uarr|rarr|darr|harr|crarr|lArr|uArr|rArr|dArr|hArr|forall|part|exist|empty|nabla|isin|notin|ni|prod|sum|minus|lowast|radic|prop|infin|ang|and|or|cap|cup|int|there4|sim|cong|asymp|ne|equiv|le|ge|sub|sup|nsub|sube|supe|oplus|otimes|perp|sdot|lceil|rceil|lfloor|rfloor|lang|rang|loz|spades|clubs|hearts|diams)
;
)
|
(?: # decimal matches
&\#
(?: 3[3-9]|4[0-467]|5[89]|6[02-4]|9[1-46]|12[3-6]|16[1-9]|17[0-24-9]|18[0-9]|19[01]|215|247|710|732|97[78]|982|821[1-9]|822[0-9]|823[01]|82[4-7][0-9]|828[0-6]|8289|829[0-2]|835[2-9]|83[6-9][0-9]|844[89]|84[5-9][0-9]|85[01][0-9]|852[0-7]|859[2-9]|85[6-9][0-9]|8[6-9][0-9][0-9]|9[01][0-9][0-9]|920[0-9]|921[0-5]|963[2-9]|96[4-9][0-9]|9[78][0-9][0-9]|99[0-7][0-9]|998[0-3]|1177[6-9]|117[89][0-9]|118[0-9][0-9]|1190[0-3] )
;
)
|
(?: # hexadecimal matches
&\#x
(?: 002[1-9a-cef]|003[a-cef]|0040|005[b-e]|0060|007[b-e]|00a[1-9a-cef]|00b[0-9a-f]|00d7|00f7|02c6|02dc|03d[126]|201[3-9a-f]|202[0-7]|20[34][0-9a-f]|205[0-9a-e]|206[1-4]|20[a-c][0-9a-f]|21[0-4][0-9a-f]|21[9a-f][0-9a-f]|2[23][0-9a-f][0-9a-f]|25[a-f][0-9a-f]|26[0-9a-f][0-9a-f]|2e[0-7][0-9a-f] )
;
)
)
'; // required modifiers: x (multiline pattern) i (case insensitive) u (utf8)
$punctuation = "
(?:
(?:
[^\w\s\&\/\@] # assume characters that are not word characters or whitespace are punctuation
# exclude & as that is an illegal stand-alone character (and would interfere with HTML character representations)
# exclude slash \/as to not include the last slash in a URL
# exclude @ as to keep twitter names together
|
$htmlPunctuation # catch any HTML reps of punctuation
)+
)
";// required modifiers: x (multiline pattern) i (case insensitive) u (utf8)
// duplicated in get_words
// letter connectors allowed in words
# hyphens ("&#45;", "&#173;", "&#8208;", "&#8209;", "&#8210;", "&#x002d;", "&#x00ad;", "&#x2010;", "&#x2011;", "&#x2012;", "&shy;")
# underscore ("&#95;", "&#x005f;")
# zero-width-space ("&#8203;", "&#x200b;")
# zero-width-non-joiner ("&#8204;", "&#x200c;", "&zwnj;")
# zero-width-joiner ("&#8205;", "&#x200d;", "&zwj;")
$htmlLetterConnectors = '
(?:
(?: # alpha matches
&
(?: shy|zwj|zwnj )
;
)
|
(?: # decimal matches
&\#
(?: 45|95|173|820[3-589]|8210 )
;
)
|
(?: # hexadecimal matches
&\#x
(?: 002d|005f|00ad|200[b-d]|201[0-2] )
;
)
|
(?: # actual characters
\x{002d}|\x{005f}|\x{00ad}|\x{200b}|\x{200c}|\x{200d}|\x{2010}|\x{2011}|\x{2012}
)
)
'; // required modifiers: x (multiline pattern) i (case insensitive) u (utf8)
// word character html entities
// character 0-9__ A-Z__ a-z___ other_special_chrs_____
// decimal 48-57 65-90 97-122 192-214,216-246,248-255, 256-383
// hex 30-39 41-5a 61-7a c0-d6 d8-f6 f8-ff 0100-017f
$htmlLetters = '
(?:
(?: # alpha matches
&
(?:Agrave|Aacute|Acirc|Atilde|Auml|Aring|AElig|Ccedil|Egrave|Eacute|Ecirc|Euml|Igrave|Iacute|Icirc|Iuml|ETH|Ntilde|Ograve|Oacute|Ocirc|Otilde|Ouml|Oslash|Ugrave|Uacute|Ucirc|Uuml|Yacute|THORN|szlig|agrave|aacute|acirc|atilde|auml|aring|aelig|ccedil|egrave|eacute|ecirc|euml|igrave|iacute|icirc|iuml|eth|ntilde|ograve|oacute|ocirc|otilde|ouml|oslash|ugrave|uacute|ucirc|uuml|yacute|thorn|yuml)
;
)
|
(?: # decimal matches
&\#
(?: 4[89]|5[0-7]|6[5-9]|[78][0-9]|90|9[7-9]|1[01][0-9]|12[0-2]|19[2-9]|20[0-9]|21[0-46-9]|2[23][0-9]|24[0-68-9]|2[5-9][0-9]|3[0-7][0-9]|38[0-3] )
;
)
|
(?: # hexadecimal matches
(?:
&\#x00
(?: 3[0-9]|4[1-9a-f]|5[0-9a]|6[1-9a-f]|7[0-9a]|c[0-9a-f]|d[0-689a-f]|e[0-9a-f]|f[0-689a-f] )
;
)
|
(?:
&\#x01[0-7][0-9a-f];
)
)
|
(?: # actual characters
[0-9A-Za-z]|\x{00c0}|\x{00c1}|\x{00c2}|\x{00c3}|\x{00c4}|\x{00c5}|\x{00c6}|\x{00c7}|\x{00c8}|\x{00c9}|
\x{00ca}|\x{00cb}|\x{00cc}|\x{00cd}|\x{00ce}|\x{00cf}|\x{00d0}|\x{00d1}|\x{00d2}|\x{00d3}|\x{00d4}|
\x{00d5}|\x{00d6}|\x{00d8}|\x{00d9}|\x{00da}|\x{00db}|\x{00dc}|\x{00dd}|\x{00de}|\x{00df}|\x{00e0}|
\x{00e1}|\x{00e2}|\x{00e3}|\x{00e4}|\x{00e5}|\x{00e6}|\x{00e7}|\x{00e8}|\x{00e9}|\x{00ea}|\x{00eb}|
\x{00ec}|\x{00ed}|\x{00ee}|\x{00ef}|\x{00f0}|\x{00f1}|\x{00f2}|\x{00f3}|\x{00f4}|\x{00f5}|\x{00f6}|
\x{00f8}|\x{00f9}|\x{00fa}|\x{00fb}|\x{00fc}|\x{00fd}|\x{00fe}|\x{00ff}|\x{0100}|\x{0101}|\x{0102}|
\x{0103}|\x{0104}|\x{0105}|\x{0106}|\x{0107}|\x{0108}|\x{0109}|\x{010a}|\x{010b}|\x{010c}|\x{010d}|
\x{010e}|\x{010f}|\x{0110}|\x{0111}|\x{0112}|\x{0113}|\x{0114}|\x{0115}|\x{0116}|\x{0117}|\x{0118}|
\x{0119}|\x{011a}|\x{011b}|\x{011c}|\x{011d}|\x{011e}|\x{011f}|\x{0120}|\x{0121}|\x{0122}|\x{0123}|
\x{0124}|\x{0125}|\x{0126}|\x{0127}|\x{0128}|\x{0129}|\x{012a}|\x{012b}|\x{012c}|\x{012d}|\x{012e}|
\x{012f}|\x{0130}|\x{0131}|\x{0132}|\x{0133}|\x{0134}|\x{0135}|\x{0136}|\x{0137}|\x{0138}|\x{0139}|
\x{013a}|\x{013b}|\x{013c}|\x{013d}|\x{013e}|\x{013f}|\x{0140}|\x{0141}|\x{0142}|\x{0143}|\x{0144}|
\x{0145}|\x{0146}|\x{0147}|\x{0148}|\x{0149}|\x{014a}|\x{014b}|\x{014c}|\x{014d}|\x{014e}|\x{014f}|
\x{0150}|\x{0151}|\x{0152}|\x{0153}|\x{0154}|\x{0155}|\x{0156}|\x{0157}|\x{0158}|\x{0159}|\x{015a}|
\x{015b}|\x{015c}|\x{015d}|\x{015e}|\x{015f}|\x{0160}|\x{0161}|\x{0162}|\x{0163}|\x{0164}|\x{0165}|
\x{0166}|\x{0167}|\x{0168}|\x{0169}|\x{016a}|\x{016b}|\x{016c}|\x{016d}|\x{016e}|\x{016f}|\x{0170}|
\x{0171}|\x{0172}|\x{0173}|\x{0174}|\x{0175}|\x{0176}|\x{0177}|\x{0178}|\x{0179}|\x{017a}|\x{017b}|
\x{017c}|\x{017d}|\x{017e}|\x{017f}
)
)
'; // required modifiers: x (multiline pattern) i (case insensitive) u (utf8)
$word = "
(?:
(?<![\w\&]) # negative lookbehind to ensure
# 1) we are preceded by a non-word-character, and
# 2) we are not inside an HTML character def
(?:
[\w\-\_\/]
|
$htmlLetters
|
$htmlLetterConnectors
)+
)
"; // required modifiers: x (multiline pattern) u (utf8)
# find any text
$anyText = "$space|$punctuation|$word"; // required modifiers: x (multiline pattern) i (case insensitive) u (utf8)
$parts = preg_split("/($anyText)/ixu", $rawText, -1, PREG_SPLIT_DELIM_CAPTURE);
$index = 0;
foreach ($parts as $part) {
if ($part != "") {
if(preg_match("/\A$space\Z/xiu", $part)) {
$tokens[$index] = array(
"type" => 'space',
"value" => $part,
);
} elseif(preg_match("/\A$punctuation\Z/sxiu", $part)) {
$tokens[$index] = array(
"type" => 'punctuation',
"value" => $part,
);
} elseif(preg_match("/\A$word\Z/xu", $part)) {
//make sure that things like email addresses and URLs are not broken up into words and punctuation
// preceded by an "other": merge this word into it
if($index-1 >= 0 && $tokens[$index-1]['type'] == 'other') {
$oldPart = $tokens[$index-1]['value'];
$tokens[$index-1] = array(
"type" => 'other',
"value" => $oldPart.$part,
);
$index = $index-1;
// preceded by a non-space token followed by punctuation: merge all three into one "other"
} elseif($index-2 >= 0 && $tokens[$index-1]['type'] == 'punctuation' && $tokens[$index-2]['type'] != 'space') {
$oldPart = $tokens[$index-1]['value'];
$olderPart = $tokens[$index-2]['value'];
$tokens[$index-2] = array(
"type" => 'other',
"value" => $olderPart.$oldPart.$part,
);
unset($tokens[$index-1]);
$index = $index-2;
} else {
$tokens[$index] = array(
"type" => 'word',
"value" => $part,
);
}
} else {
//make sure that things like email addresses and URLs are not broken up into words and punctuation
// preceded by an "other" or "word": merge this part into it
if($index-1 >= 0 && ($tokens[$index-1]['type'] == 'word' || $tokens[$index-1]['type'] == 'other')) {
$index = $index-1;
$oldPart = $tokens[$index]['value'];
$tokens[$index] = array(
"type" => 'other',
"value" => $oldPart.$part,
);
// preceded by a non-space token followed by punctuation: merge all three into one "other"
} elseif($index-2 >= 0 && $tokens[$index-1]['type'] == 'punctuation' && $tokens[$index-2]['type'] != 'space') {
$oldPart = $tokens[$index-1]['value'];
$olderPart = $tokens[$index-2]['value'];
$tokens[$index-2] = array(
"type" => 'other',
"value" => $olderPart.$oldPart.$part,
);
unset($tokens[$index-1]);
$index = $index-2;
} else {
$tokens[$index] = array(
"type" => 'other',
"value" => $part,
);
}
}
if(isset($this->parsedHTML["parents"]))
$tokens[$index]["parents"] = $this->parsedHTML["parents"];
$index++;
}
}
$this->text = $tokens;
return TRUE;
}
# Action: reloads $this->text (i.e. capture new inserted text, or remove those whose values are deleted)
# Returns: TRUE on completion
# WARNING: Tokens previously acquired through "get" methods may not match new tokenization
function reload() {
return $this->load($this->unload());
}
# Action: outputs Text as string
# Returns: STRING of Text (if a string was initially loaded), or ARRAY holding a single parseHTML token (if a token was initially loaded)
function unload() {
$reassembledText = "";
foreach($this->text as $token) {
$reassembledText .= $token["value"];
}
if($this->parsedHTML != "") {
// the initial value loaded was a single token of class parseHTML, so we will return in the same format
$this->parsedHTML["value"] = $reassembledText;
$output = $this->parsedHTML;
} else {
// the initial value loaded was a string, so we will return in the same format
$output = $reassembledText;
}
$this->clear();
return $output;
}
# Action: unsets $this->text
# Returns: TRUE on completion
function clear() {
$this->text = array();
$this->parsedHTML = "";
return TRUE;
}
# Parameter: ARRAY of tokens
# Action: overwrite "value" for all matching tokens
# Returns: TRUE on completion
function update($tokens) {
foreach($tokens as $index => $token) {
$this->text[$index]["value"] = $token["value"];
}
return TRUE;
}
########################################################################
# GET METHODS
#
# Returns: ARRAY of sought tokens
function get_all() {
return $this->text;
}
function get_spaces() {
return $this->get_type("space");
}
function get_punctuation() {
return $this->get_type("punctuation");
}
# Parameter: $abc letter-only match OPTIONAL INT -1=>prohibit, 0=>allow, 1=>require
# $caps capital-only match (allows non letter chrs) OPTIONAL INT -1=>prohibit, 0=>allow, 1=>require
function get_words($abc = 0, $caps = 0) {
$words = $this->get_type("word");
$tokens = array();
//duplicated from load
$htmlLetterConnectors = '
(?:
(?: # alpha matches
&
(?: shy|zwj|zwnj )
;
)
|
(?: # decimal matches
&\#
(?: 45|95|173|820[3-589]|8210 )
;
)
|
(?: # hexadecimal matches
&\#x
(?: 002d|005f|00ad|200[b-d]|201[0-2] )
;
)
|
(?: # actual characters
\x{002d}|\x{005f}|\x{00ad}|\x{200b}|\x{200c}|\x{200d}|\x{2010}|\x{2011}|\x{2012}
)
)
'; // required modifiers: x (multiline pattern) i (case insensitive) u (utf8)
foreach($words as $index => $token) {
if($this->mb) {
$capped = mb_strtoupper($token["value"], "UTF-8");
$lettered = preg_replace("/".$htmlLetterConnectors."|[0-9\-_&#;\/]/ux", "", $token["value"]);
} else {
$capped = strtoupper($token["value"]);
$lettered = preg_replace("/".$htmlLetterConnectors."|[0-9\-_&#;\/]/ux", "", $token["value"]);
}
if( ($abc == -1 && $lettered != $token["value"]) && ($caps == -1 && $capped != $token["value"]) ) $tokens[$index] = $token;
elseif( ($abc == -1 && $lettered != $token["value"]) && $caps == 0 ) $tokens[$index] = $token;
elseif( ($abc == -1 && $lettered != $token["value"]) && ($caps == 1 && $capped == $token["value"]) ) $tokens[$index] = $token;
elseif( $abc == 0 && ($caps == -1 && $capped != $token["value"]) ) $tokens[$index] = $token;
elseif( $abc == 0 && $caps == 0 ) $tokens[$index] = $token;
elseif( $abc == 0 && ($caps == 1 && $capped == $token["value"]) ) $tokens[$index] = $token;
elseif( ($abc == 1 && $lettered == $token["value"]) && ($caps == -1 && $capped != $token["value"]) ) $tokens[$index] = $token;
elseif( ($abc == 1 && $lettered == $token["value"]) && $caps == 0 ) $tokens[$index] = $token;
elseif( ($abc == 1 && $lettered == $token["value"]) && ($caps == 1 && $capped == $token["value"]) ) $tokens[$index] = $token;
}
return $tokens;
}
function get_other() {
return $this->get_type("other");
}
#=======================================================================
#=======================================================================
#== MISC. METHODS
#=======================================================================
#=======================================================================
# Params: STRING type to get
function get_type($type) {
$tokens = array();
foreach($this->text as $index => $token) {
if($token["type"] == $type)
$tokens[$index] = $token;
}
return $tokens;
}
} // end class parseText
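
The nine-branch chain in `get_words` is the cross product of two three-state flags (-1 prohibit, 0 allow, 1 require). A compact way to express the same decision is sketched below. The names `tri_state_matches` and `word_passes_filter` are hypothetical, and the sketch is ASCII-only: it leaves out the HTML letter-connector entities and the multibyte upper-casing that the class handles. Unlike the original chain, any flag value other than -1 or 1 is treated as "allow".

```php
<?php
/**
 * Hypothetical helper: evaluate one three-state flag against a condition.
 *  1 => the condition must hold, -1 => it must not hold, anything else => no constraint.
 */
function tri_state_matches(int $mode, bool $holds): bool
{
    if ($mode === 1) {
        return $holds;
    }
    if ($mode === -1) {
        return !$holds;
    }
    return true;
}

/**
 * Hypothetical, ASCII-only restatement of the get_words filter for one token value.
 */
function word_passes_filter(string $value, int $abc = 0, int $caps = 0): bool
{
    // "Lettered" means nothing is lost when digits and connector characters are removed.
    $lettered = preg_replace('/[0-9\-_&#;\/]/', '', $value);
    // "Capped" means upper-casing the value changes nothing.
    $capped = strtoupper($value);

    return tri_state_matches($abc, $lettered === $value)
        && tri_state_matches($caps, $capped === $value);
}

var_dump(word_passes_filter('NASA', 1, 1));
var_dump(word_passes_filter('mother-in-law', 1, 0));
```

Writing the filter as two independent checks makes it easier to see that the two flags never interact, which the long `elseif` chain obscures.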


@ -0,0 +1,25 @@
<?php
/*
Project Name: PHP Parser
URI: http://kingdesk.com/projects/php-parser/
Author: Jeffrey D. King
Author URI: http://kingdesk.com/about/jeff/
Version: 1.19
Copyright 2009, KINGdesk, LLC. Licensed under the GNU General Public License 2.0. If you use, modify and/or redistribute this software, you must leave the KINGdesk, LLC copyright information, the request for a link to http://kingdesk.com, and the web design services contact information unchanged. If you redistribute this software, or any derivative, it must be released under the GNU General Public License 2.0. This program is distributed without warranty (implied or otherwise) of suitability for any particular purpose. See the GNU General Public License for full license terms <http://creativecommons.org/licenses/GPL/2.0/>.
WE DON'T WANT YOUR MONEY: NO TIPS NECESSARY! If you enjoy this plugin, a link to http://kingdesk.com from your website would be appreciated.
For web design services, please contact info@kingdesk.com.
*/
# two classes defined:
# - parseHTML
# - parseText
#
# PHP Parser has been tested in PHP5. It may work in PHP4, but it has not been tested in that environment
# if you have problems or success in PHP4, please let us know at info@kingdesk.com
require_once('parseHTML.php');
require_once('parseText.php');
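
For orientation, the tokenizing approach used by the `parseText` class can be shown in miniature: split on a combined space/punctuation/word pattern with `PREG_SPLIT_DELIM_CAPTURE`, classify each piece, and glue pieces back together into an "other" token when they look like part of an e-mail address or URL. The function `tokenize_sketch` and its much-reduced patterns are assumptions for illustration only; the real class also recognises HTML entities and a wider set of Unicode characters.

```php
<?php
/**
 * Hypothetical, simplified tokenizer in the style of parseText::load().
 * Returns a list of array('type' => ..., 'value' => ...) tokens.
 */
function tokenize_sketch(string $text): array
{
    // Reduced stand-ins for the library's patterns (assumed for this sketch).
    $space       = '\s+';
    $punctuation = '[^\w\s&\/@]+';
    $word        = '[\w\-\/]+';

    // Capturing the delimiter keeps every matched piece in the result.
    $parts = preg_split(
        "/($space|$punctuation|$word)/u",
        $text,
        -1,
        PREG_SPLIT_DELIM_CAPTURE | PREG_SPLIT_NO_EMPTY
    );

    $tokens = array();
    foreach ($parts as $part) {
        if (preg_match("/\A$space\z/u", $part)) {
            $type = 'space';
        } elseif (preg_match("/\A$punctuation\z/u", $part)) {
            $type = 'punctuation';
        } elseif (preg_match("/\A$word\z/u", $part)) {
            $type = 'word';
        } else {
            $type = 'other';
        }

        $n     = count($tokens);
        $prev  = $n >= 1 ? $tokens[$n - 1]['type'] : null;
        $prev2 = $n >= 2 ? $tokens[$n - 2]['type'] : null;
        $glue  = ($type === 'word' || $type === 'other');

        if ($glue && ($prev === 'other' || ($type === 'other' && $prev === 'word'))) {
            // Extend the previous token instead of starting a new one.
            $tokens[$n - 1] = array('type' => 'other', 'value' => $tokens[$n - 1]['value'] . $part);
        } elseif ($glue && $prev === 'punctuation' && $prev2 !== null && $prev2 !== 'space') {
            // Non-space token + punctuation + this part become a single "other".
            $value = $tokens[$n - 2]['value'] . $tokens[$n - 1]['value'] . $part;
            array_pop($tokens);
            $tokens[$n - 2] = array('type' => 'other', 'value' => $value);
        } else {
            $tokens[] = array('type' => $type, 'value' => $part);
        }
    }
    return $tokens;
}

print_r(tokenize_sketch('mail a@b.c'));
```

The merging step is what keeps something like an e-mail address in one piece, so that later typographic processing does not treat its parts as ordinary words and punctuation.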

File diff suppressed because it is too large

62
typography.php Normal file

@ -0,0 +1,62 @@
<?php
/*
* Name: Typography
* Description: Applies typographical enhancements to the postings before displaying them
* Version: 0.1
* Author: Tobias Diekershoff <tobias@f.diekershoff.de>
* License: GPL 2.0
*/
function typography_install () {
    register_hook ('prepare_body', 'addon/typography/typography.php', 'typography_render' );
}
function typography_uninstall () {
    unregister_hook ('prepare_body', 'addon/typography/typography.php', 'typography_render' );
}
function typography_render ( &$a, &$o) {
    require_once('php-typography/php-typography.php');
    require_once('library/langdet/Text/LanguageDetect.php');
    $typo = new phpTypography();
    // Short language codes; each entry corresponds to the name at the
    // same position in $lng_long below.
    $lng_id = array(
        'hu',
        'is',
        'tr',
        'bg',
        'cs',
        'fi',
        'fr',
        'it',
        'ro',
        'es',
        'pt',
        'no',
        'ru',
        'sv',
        'pl',
        'en_GB',
        'de');
    // Long language names, in the same order as $lng_id.
    $lng_long = array(
        'hungarian',
        'icelandic',
        'turkish',
        'bulgarian',
        'czech',
        'finnish',
        'french',
        'italian',
        'romanian',
        'spanish',
        'portuguese',
        'norwegian',
        'russian',
        'swedish',
        'polish',
        'english',
        'german');
    // Detect the language of the posting and map its long name to the short code.
    $l = new Text_LanguageDetect;
    $lng = $l->detectSimple($o['html']);
    $lng = str_replace( $lng_long, $lng_id, $lng);
    $typo->settings["diacriticLanguage"] = $lng;
    // Apply the typographic enhancements to the rendered posting.
    $o['html'] = $typo->process($o['html']);
    unset($l);
}
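
The name-to-code mapping above relies on `str_replace` with two parallel arrays, which leaves an unlisted language name unchanged. A possible alternative, shown here only as a sketch, is an exact lookup in an associative array that reports unsupported or undetected languages explicitly. The function name `typography_language_code_sketch` is hypothetical, returning `null` for unknown input is an assumption of this example rather than the addon's behaviour, and the nullable type hints need PHP 7.1 or later.

```php
<?php
/**
 * Hypothetical helper: map a detected language name to the short code
 * used in the addon, or return null when the language is not listed.
 */
function typography_language_code_sketch(?string $detected): ?string
{
    // Same pairs as the $lng_long / $lng_id arrays in typography_render().
    static $map = array(
        'hungarian'  => 'hu',
        'icelandic'  => 'is',
        'turkish'    => 'tr',
        'bulgarian'  => 'bg',
        'czech'      => 'cs',
        'finnish'    => 'fi',
        'french'     => 'fr',
        'italian'    => 'it',
        'romanian'   => 'ro',
        'spanish'    => 'es',
        'portuguese' => 'pt',
        'norwegian'  => 'no',
        'russian'    => 'ru',
        'swedish'    => 'sv',
        'polish'     => 'pl',
        'english'    => 'en_GB',
        'german'     => 'de',
    );

    if ($detected === null) {
        return null; // nothing was detected
    }
    $key = strtolower($detected);
    return isset($map[$key]) ? $map[$key] : null;
}

var_dump(typography_language_code_sketch('german'));
var_dump(typography_language_code_sketch('danish'));
```

A caller could then skip setting the diacritic language, or fall back to a default of its choice, whenever the helper returns `null`.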