hsemail's Issues

Problem parsing UTF-8 characters

Hello. I'm using your library to parse some emails, but it gets stuck when it finds a non-US-ASCII character. For example, take a look at the following email:

To: <[email protected]>
Subject: A Subject
Date: Mon, 19 Aug 2013 15:30:17 -0300
From: Jorge anonimo <[email protected]>
MIME-Version: 1.0
Content-Transfer-Encoding: 8bit
Content-Type: text/plain; charset="UTF-8"


Moodle -> Foros -> Foro de consultas de práctica -> Consulta
segundo examen de promoción

Hola queria saber que temas se incluiran en el segundo examen de promocion.

muchas gracias

Jorge

The result of parsing the email will be:

Message [OptionalField "To" " <[email protected]>",
         Subject " A Subject",
         Date (CalendarTime {ctYear = 2013, ctMonth = August, ctDay = 19, ctHour = 15, ctMin = 30, ctSec = 17, ctPicosec = 0, ctWDay = Monday, ctYDay = 0, ctTZName = "", ctTZ = -10800, ctIsDST = False}),
         From [NameAddr {nameAddr_name = Just "Jorge anonimo", nameAddr_addr = "[email protected]"}],
         OptionalField "MIME-Version" " 1.0",
         OptionalField "Content-Transfer-Encoding" " 8bit",
         OptionalField "Content-Type" " text/plain; charset=\"UTF-8\""]
         "\r\nMoodle -> Foros -> Foro de consultas de pr"

As you can see, the library cuts my email off at the first non-ASCII character. Is this the expected behavior of the library? If it is, what would be the proper way to parse this kind of email?

Thanks in advance.

Cheers.
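
The cut-off point in the output above coincides with the first non-ASCII character of the body (the "á" in "práctica"). A minimal sketch that reproduces the same split with plain list functions follows. It only illustrates the symptom; `asciiPrefix` is a hypothetical helper and is not how the library is implemented.

```haskell
import Data.Char (isAscii)

-- Hypothetical helper: keep the longest prefix made only of US-ASCII characters.
asciiPrefix :: String -> String
asciiPrefix = takeWhile isAscii

main :: IO ()
main = do
  let line = "Moodle -> Foros -> Foro de consultas de práctica -> Consulta"
  -- The prefix stops right before the first non-ASCII character.
  print (asciiPrefix line)
  -- List the characters that fall outside US-ASCII.
  print (filter (not . isAscii) line)
```

The first `print` shows the text ending in "pr", the same place where the parsed body stops.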

Could not find module ‘Data.Time.Calendar.Compat’

Trying to compile:

Configuring hsemail-2.2.1...
Preprocessing library for hsemail-2.2.1..
Building library for hsemail-2.2.1..

on the commandline: warning: ignoring -split-objs
[2 of 2] Compiling Text.Parsec.Rfc2822 ( src/Text/Parsec/Rfc2822.hs, dist/build/Text/Parsec/Rfc2822.o, dist/build/Text/Parsec/Rfc2822.dyn_o )

src/Text/Parsec/Rfc2822.hs:26:1: error:
    Could not find module ‘Data.Time.Calendar.Compat’
    There are files missing in the ‘time-compat-1.9.5’ package,
    try running 'ghc-pkg check'.
    Use -v (or `:set -v` in ghci) to see a list of the files searched for.
   |
26 | import Data.Time.Calendar.Compat
   | ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

I do have time-compat installed:

pacman -Q haskell-time-compat 
haskell-time-compat 1.9.5-47

Is that some sort of compatibility issue? Distro: Arch Linux x86_64, rolling

Parameterized or flexible newlines?

Thanks, this parser is almost exactly what I was looking for. I'm trying to use it to parse Unix maildir files, which are in almost the right format, except that they use Unix newlines (LF, not CRLF). I realize this means they're not strictly RFC-compliant, but it seems to be a fairly common variant in Unix mailboxes. Would you be open to adding a version parameterized by the newline type in some way, or an option for flexible newlines? I imagine generalized variants of the relevant functions, like headerNL (though I realize this is nearly every function), or maybe a separate module. Happy to submit a PR for whichever approach.

(Alternatively I could just pre-process the streams to switch newlines, but this seems a bit messy.)
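
For reference, the pre-processing alternative could look roughly like the sketch below. `lfToCrlf` is a hypothetical helper, not part of hsemail; it assumes the message is held as a `String` and leaves existing CRLF pairs untouched so that already-compliant input is not changed.

```haskell
-- Hypothetical helper: turn bare LF line endings into CRLF.
-- Existing CRLF pairs are copied as-is, so the function is safe to apply twice.
lfToCrlf :: String -> String
lfToCrlf ('\r' : '\n' : rest) = '\r' : '\n' : lfToCrlf rest
lfToCrlf ('\n' : rest)        = '\r' : '\n' : lfToCrlf rest
lfToCrlf (c : rest)           = c : lfToCrlf rest
lfToCrlf []                   = []

main :: IO ()
main = do
  let unixMessage = "Subject: hi\n\nbody\n"
  -- prints "Subject: hi\r\n\r\nbody\r\n"
  print (lfToCrlf unixMessage)
```

The converted string can then be handed to the existing parsers unchanged, at the cost of one extra pass over the input.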

Release 2.2.1?

Hi, I just tried to use this package for its date parser, but ran into the bug where the 31st of a month is not accepted. I see this was fixed in master, but the fix hasn't been released yet. Any chance 2.2.1 could be released?

Is this a bug? \237 in an email name

import Text.Parsec (parse)
import Text.ParserCombinators.Parsec.Rfc2822 (address)

main :: IO ()
main = print (parse address "" "Dav\237d Fox <nobody@nowhere>")

Output:

Left (line 1, column 9):
unexpected "\237"
expecting word, ".", white-space, carriage return followed by linefeed, comment or ":"

The character is "Unicode Character 'LATIN SMALL LETTER I WITH ACUTE' (U+00ED)"

Why are some cases handled differently?

I'm trying to understand the source code, and I stumbled upon this:

day_name :: Stream s m Char => ParsecT s u m DayOfWeek
day_name = choice [ caseString "Mon" $> Monday
                  , try (caseString "Tue" $> Tuesday)
                  , caseString "Wed" $> Wednesday
                  , caseString "Thu" $> Thursday
                  , caseString "Fri" $> Friday
                  , try (caseString "Sat" $> Saturday)
                  , caseString "Sun" $> Sunday
                  ]
           <?> "name of a day-of-the-week"

Why do only Tue and Sat have an extra try call and what is it good for?
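
A likely explanation, sketched under the assumption that `caseString` consumes input on a partial match the way Parsec's `string` does: "Tue" shares its first letter with "Thu", and "Sat" shares its first letter with "Sun". Without `try`, a failed attempt at "Tue" has already consumed the "T", so `choice` does not move on to the next alternative. The example below uses plain `string` and a reduced, hypothetical `Day` type instead of the library's definitions.

```haskell
import Data.Functor (($>))
import Text.Parsec (choice, parse, string, try)
import Text.Parsec.String (Parser)

-- Reduced, hypothetical stand-in for the library's DayOfWeek.
data Day = Tue | Thu deriving (Eq, Show)

-- No try: "Thu" makes the first branch fail after consuming 'T',
-- so the second branch is never attempted.
withoutTry :: Parser Day
withoutTry = choice [string "Tue" $> Tue, string "Thu" $> Thu]

-- With try: the first branch backtracks on failure,
-- and the second branch sees the input from the start.
withTry :: Parser Day
withTry = choice [try (string "Tue") $> Tue, string "Thu" $> Thu]

main :: IO ()
main = do
  print (parse withoutTry "" "Thu") -- a Left (parse error)
  print (parse withTry "" "Thu")    -- Right Thu
```

Under the same assumption, "Mon", "Wed" and "Fri" need no `try` because no other alternative starts with the same letter, and "Thu" and "Sun" need none because they are the last alternatives for their initial.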
