CVersionInfo4Dr. Detlef Meyer-EltzPParsergenerator and Interpreter1.6.2.1 Tetra.exe+Copyright 2002 - 09 Dr. Detlef Meyer-Eltz Tetra.exeTextTransformernormal BODY_TEXTCTokenK BODY_TEXT[^\r\n]{1,998}BOUNDARY_BEGINCTokenԦKBOUNDARY_BEGIN {DYNAMIC} BOUNDARY_ENDCTokenK BOUNDARY_END {DYNAMIC}CRLFCToken2CRLF\r\nCTEXTCTokennKCTEXT3specification as single character changed to repeat[^()\\\t\n\r \x7F-\xFF]+DIGITSCTokenh4DIGITS\d+FTEXTCTokenD5FTEXTMcharacters that have values between 33. and 126., decimal, except colon[^\x{00}- :\x{7f}]+FTEXT_SPECIALSCToken 6FTEXT_SPECIALS'[-!\"#$%&'()*+,./;<=>?@\[\\\]\^_`{|}~]+FWSCToken6FWSfolding white space (\r\n[ \t]+)+LABELCTokenL,LABELhFTEXT characters that have values between 33. and 126., decimal, except colon, followed by a colon[^\x{00}- :\x{7f}]+[ \t]*:NBSPCToken(-NBSPgeschtztes Leerzeichen\xA0 NON_ASCIICToken. NON_ASCII [\x7F-\xFF]+NOT_LABEL_OCTETCToken.NOT_LABEL_OCTET[^\r\n:]OCTET1CToken/OCTET1cnot '\r' and not '\n' and not characters that have values between 33. and 126 decimal (FTEXT) [^\r\n!-~]+OCTET2CToken̆KOCTET2#beginning with single '\r' or '\n' ([\r\n][^\r\n:]+)+OCTET3CTokenKOCTET3cnot '\r' and not '\n' and not characters that have values between 33. and 126 decimal (FTEXT)[^\r\n[:alpha:][:digit:]]+OCTET4CTokenKOCTET4 [\r\n][^\r\n[:alpha:][:digit:]]+OCTET5CToken`KOCTET5[\r\n]([[:alpha:][:digit:]]+)OCTETSCToken@,;:\\"/\[\]?=\x00-\x1F]+WORDCTokenKWORD [[:alpha:]]+_messageCTokenK_messagemessage _multipartCToken|K _multipart multipart_plainCTokenhK_plainplain_rfc822CTokenTK_rfc822rfc822_textCToken@K_texttext" attributeCProduction_LINK\K attribute2Matching of attributes is ALWAYS case-insensitive. token 4` commentbodyCProduction_LINKKbodyJCRLF+ text_element_ex ( text_element_ex | CRLF )* 4` body_partCProduction_LINKK body_part"message" as defined in RFC 822, with all header fields optional, not starting with the specified dash-boundary, and with the delimiter not occurring anywhere in the body part. 
Note that the semantics of a part differ from the semantics of a message, as described in the text.{{ InitParams(); }} // default if not explicit CRLF field* ( IF(IsText()) octets_begin mime_text? ELSE octets END )? 4`ccontentCProduction_LINKKccontent- CTEXT FWS? | QUOTED_PAIR FWS? | comment 4`commentCProduction_LINKKcomment"(" FWS? ccontent* ")" FWS? 4`composite_typeCProduction_LINKxKcomposite_type9 "message" | _multipart | extension_token 4` commentcontentCProduction_LINK4KcontentMatching of media type and subtype is ALWAYS case-insensitive. The type, subtype, and parameter names are not case sensitive. For example, TEXT, Text, and TeXt are all equivalent top-level media types. Parameter values are normally case sensitive, but sometimes are interpreted in a case-insensitive fashion, depending on the intended use. (For example, multipart boundaries are case-sensitive, but the "access-type" parameter for message/External-body is not case-sensitive.) //"Content-Type" ":" "Content-Type:" FWS? //{{ InitParams(); }} type FWS? "/" FWS? subtype FWS? ( ";" FWS? ( parameter | {{m_iResult = -1; log << "missing content-type parameter" << endl; }} ) )* CRLF 4` comment discrete_typeCProduction_LINKK discrete_typed "text" | "image" | "audio" | "video" | "application" // provisorisch | extension_token 4` comment encapsulationCProduction_LINKK encapsulationSBOUNDARY_BEGIN body_part ( BOUNDARY_END {{ PopScope(); }} epilogue? )? epilogueCProduction_LINKPKepilogue*//discard_text CRLF (BODY_TEXT* CRLF )* extension_tokenCProduction_LINK Kextension_token ietf_token | x_token 4` commentfieldCProduction_LINKKfield$ general_field | content 4` comment field_bodyCProduction_LINKK field_body-field_body_contents ( FWS field_body )? 
4` commentfield_body_contentsCProduction_LINK@Kfield_body_contentsthe ASCII characters making up the field-body, as defined in the following sections, and consisting of combinations of atom, quoted-string, and specials tokens, or else consisting of texts0( text_element_ex | QUOTED_PAIR )+ 4` commentftextCProduction_LINKKftextMcharacters that have values between 33. and 126., decimal, except colon1( WORD | DIGITS | FTEXT_SPECIALS )+ 4` comment general_fieldCProduction_LINK@K general_field(//FTEXT ":" LABEL field_body? CRLF 4` comment ietf_tokenCProduction_LINKK ietf_tokenMAn extension token defined by a standards-track RFC and registered with IANA.SKIP 4` commentmessageCProduction_LINKmessageTests, whether the e-mail is constructed formally correct. If the mail is parsed 0 is returned. Parser errors should be regarded as spam.{{ m_iResult = 0; }} field* IF(IsMultiPart()) multipart_body ELSE body? END {{ out << m_iResult; // has to be called before CopyToDisk CopyToDisk(); }} 4` mime_textCProduction_LINK mime_text( text_element_ex | "(" // prior to comment-inclusion of calling production | CRLF // makes a stop before a possible BOUNDARY )+ 0`\t multipart_bodyCProduction_LINKmultipart_bodypbody_part? ( IF(IsMultiPart()) multipart_encapsulation ELSE encapsulation END )* 4`multipart_encapsulationCProduction_LINK multipart_encapsulationBOUNDARY_BEGIN body_part ( IF(IsMultiPart()) multipart_encapsulation ELSE encapsulation END )* ( BOUNDARY_END {{ PopScope(); }} epilogue? )? 
// might be closed at the end of the last encapsulation already 4` commentoctetCProduction_LINKT octetno comment inclusion OCTETS | CRLF // makes a stop before a possible BOUNDARY // followed by longer expression BOUNDARY or EOF 4`octetsCProduction_LINK octetsoctets_begin octet* 0`\t octets_beginCProduction_LINK octets_beginno comment inclusion WORD | DIGITS | FTEXT_SPECIALS | ":" // so the field alternative can be tested in a look-ahead | OCTET1 | OCTET2 | CRLF // makes a stop before a possible BOUNDARY // followed by longer expression BOUNDARY or EOF 4` parameterCProduction_LINK parameter{{ str sValue; }} ( attribute "=" value[sValue] | "boundary" "=" ( value[sValue] | SKIP // e.g.: ----=_NextPart_000_0023_91_0CD33694.F112A954 is not recognized as TOKEN, because of the '=' {{ sValue = xState.str(); m_iResult = -1; log << "BAD_BOUNDARY" << endl; }} ) {{ str sCrLf = "\r\n"; AddToken(sCrLf + m_sDelimConst + sValue, "BOUNDARY_BEGIN", sValue); AddToken(sCrLf + m_sDelimConst + sValue + m_sDelimConst, "BOUNDARY_END", sValue); PushScope(sValue); m_sBoundary = sValue; }} | "charset" "=" value[sValue] {{ m_sCharset = sValue; }} ) 4` commentqcontentCProduction_LINKqcontentV QTEXT | QUOTED_PAIR | NON_ASCII {{m_iResult = -1; log << "bad qtext" << endl; }} 4` comment quoted_stringCProduction_LINK quoted_string#"\"" FWS? ( qcontent FWS? 
)* "\"" 4` commentsubtypeCProduction_LINKT subtype( SKIP | _plain // text/plain (empirical) | "html" // text/html | _rfc822 // message/rfc822 | "enriched" | "jpeg" | "gif" | "audio" | "basic" | "video" | "mpeg" | "octet-stream" | "PostScript" | "partial" | "external-body" | "mixed" | "alternative" | "parallel" | "digest" ) {{ m_sSubType = xState.str(); }} /* extension_token | iana_token */ 4` comment text_elementCProduction_LINK% text_elementno comment inclusionb WORD | DIGITS | STRING | PUNCTUATION | SPECIAL | NBSP 4`text_element_exCProduction_LINK<text_element_exD text_element | ( "\r" | "\n" ) ( text_element | EOF ) \t tokenCProduction_LINK<'token71*TOKEN 4` commenttypeCProduction_LINK(typeO( discrete_type | composite_type ) {{ m_sType = xState.str(); }} 4` commentvalueCProduction_LINK*valueParameter values are normally case sensitive, but sometimes are interpreted in a case-insensitive fashion, depending on the intended use. (For example, multipart boundaries are case-sensitive, but the "access-type" parameter for message/External-body is not case-sensitive.) Note that the value of a quoted string parameter does not include the quotes. 
That is, the quotation marks in a quoted-string are not a part of the value of the parameter, but are merely used to delimit that parameter value.{ token {{ xsValue = xState.str(); }} | STRING //quoted_string {{ xsValue = xState.str(1); }} str& xsValue4` commentx_tokenCProduction_LINK"x_tokenXThe two characters "X-" or "x-" followed, with no intervening white space, by any token "X-" SKIP 4` comment CopyToDiskCElementScriptK CopyToDiskA{{ if(!ExtraParam().empty()) { str sTestDir = "D:\\Tetra\\Projects\\TextTransformer\\impfilter\\Complete"; str sTestFile = append_path(sTestDir, ExtraParam() + ".txt"); if(exists(sTestDir)) { RedirectOutputBinary(sTestFile); out << xState.text(0); //ResetOutput(); } } }} InitParamsCElementScriptK InitParams|If no Content-Type field is present it is assumed to be "message/rfc822" in a "multipart/digest" and "text/plain" otherwise.i{{ m_sBoundary = ""; m_sCharset = "us-ascii"; m_sType = ""; m_sSubType = ""; m_sEncoding = ""; }} IsHTMLCElementScriptKIsHTMLZ{{ if(m_sType == "text") return m_sSubType == "html"; else return false; }} bool IsMultiPartCElementScriptL IsMultiPart({{ return m_sType == "multipart"; }} boolIsTextCElementScriptIsText{{ if(m_sType == "text") return m_sSubType == "plain"; else if(m_sType == "message") return m_sSubType == "rfc822"; else return false; }} bool m_iResultCElementScriptX m_iResultint m_sBoundaryCElementScriptK m_sBoundarystr m_sCharsetCElementScript, m_sCharset"{{ m_sCharset = "US-ASCII"; }} str m_sDelimConstCElementScript,- m_sDelimConst{{ m_sDelimConst = "--"; }} str m_sEncodingCElementScript- m_sEncodingstr m_sSubTypeCElementScript0 m_sSubTypestrm_sTypeCElementScript\1m_sTypestr OnParseErrorCElementScript2 OnParseError_{{ if(!ExtraParam().empty()) { str sTestDir = "D:\\Tetra\\Projects\\TextTransformer\\impfilter\\Complete\\ParseError"; str sTestFile = append_path(sTestDir, ExtraParam() + ".txt"); if(exists(sTestDir)) { RedirectOutputBinary(sTestFile); out << xState.text(0) << xState.str(-2); 
//ResetOutput(); } } }} COptionsProject0GProjectOptionsProjectOptionsCOptionSectionqKProjectOptions; CaseSensitiveCScript< CaseSensitive0CharTypeTemplateCScript4KCharTypeTemplate1 CommentToCodeCScript,K CommentToCode0ComponentSupportCScript0ComponentSupport'D:\TextTransformer\Frames\enums_pas.frm ConfigParamCScript7 ConfigParam""CopyCodeCScriptCopyCode0Cpp_PrjParserHeaderCScriptxGCpp_PrjParserHeader(D:\TextTransformer\Frames\ttparser_h.frmCpp_PrjParserSourceCScriptGCpp_PrjParserSource(D:\TextTransformer\Frames\ttparser_c.frmCreateConstProductionsCScript@HCreateConstProductions0CreateInterfaceCScriptHCreateInterface0CreateWideCharRegexCScriptICreateWideCharRegex0DOMBOMCScriptlIDOMBOM0DOMDefaultLabelCScriptIDOMDefaultLabelemptyDOMDocTypeNameCScript4JDOMDocTypeName DOMEncodingCScriptJ DOMEncodingUTF-8DOMPrettyPrintCScriptJDOMPrettyPrint1 DOMPublicIDCScript`K DOMPublicID DOMRootLabelCScriptK DOMRootLabelroot DOMStandaloneCScript(L DOMStandalone1 DOMSystemIDCScriptL DOMSystemIDDOMWriteDeclarationCScriptLDOMWriteDeclaration1 ExportableCScriptTM Exportable1 ExtraParamCScriptM ExtraParam""GlobalLiteralScannerCScriptNGlobalLiteralScanner1GlobalRegexScannerCScriptNGlobalRegexScanner0 IgnoreCharsCScriptN IgnoreChars\tIgnoreWhiteSpaceCScriptHOIgnoreWhiteSpace1InclusionOverlapWarningCScriptOInclusionOverlapWarning1 InclusionProdCScriptP InclusionProd IndentCharCScripttP IndentCharws IndentDeltaCScriptP IndentDelta2 InterpretableCScript