cl-olefs

OLE File System tools for Common Lisp

#+title: cl-olefs
#+author: Tomas Hlavaty
#+options: creator:nil

cl-olefs

Project home: http://logand.com/sw/cl-olefs.html

Contact: http://logand.com/contact.html

Up: http://logand.com/sw/

* Introduction

cl-olefs is a library for reading MS Office files (PPT, DOC, XLS)
implemented in portable Common Lisp.

It is licensed under an MIT-style licence.

On SBCL and CCL there are no dependencies.  On other Common Lisp
implementations, the only dependency is [[https://github.com/marijnh/ieee-floats][ieee-floats]].

* Download and install

Download the source code:

: $ git clone http://logand.com/git/cl-olefs.git

Then set up the Common Lisp environment to find the cl-olefs.asd file
and load the system into the Lisp image.

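One way to do this is through ASDF's central registry.  The sketch
below assumes that the repository was cloned to the placeholder
directory =/path/to/cl-olefs/= and that the system defined in
cl-olefs.asd is named =cl-olefs=.  Neither is confirmed here, so
adjust both as needed.

#+begin_src lisp
  ;; Assumption: /path/to/cl-olefs/ is a placeholder for the directory
  ;; that contains cl-olefs.asd.  The trailing slash is required.
  (require :asdf)
  (pushnew #p"/path/to/cl-olefs/" asdf:*central-registry* :test #'equal)

  ;; Assumption: the system is named after its .asd file.
  (asdf:load-system "cl-olefs")
#+end_src
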
* Usage

The olefs package does not export any symbols yet, because it is not
yet clear to me what a good API should look like.  However, it is
already possible to:

- read PPT files, transform them to HTML and extract images
- partially read DOC files, especially formatting records
- read XLS files; many features are missing, but it is already
  possible to parse text cells and cells containing double-precision
  floating-point numbers.

Because there is no API yet, use M-. instead of documentation (are
you using [[http://common-lisp.net/project/slime/][Slime]] yet?).  The code should be simple to follow,
apart from the bizarre file format rules, which are documented by MS
and others.

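With no exported symbols, it can help to list what a package defines
before jumping around with M-.  The helper below is a small sketch
and not part of cl-olefs: =internal-functions= is a hypothetical name,
and it uses only standard Common Lisp package functions.

#+begin_src lisp
  ;; Hypothetical helper, not part of cl-olefs.
  ;; Return the symbols that are internal to PACKAGE-DESIGNATOR and
  ;; name a function or macro, sorted by name.  Return NIL when no
  ;; such package exists.
  (defun internal-functions (package-designator)
    (let ((package (find-package package-designator))
          (names '()))
      (when package
        (do-symbols (sym package)
          (when (and (eq (symbol-package sym) package) ; skip inherited symbols
                     (fboundp sym)
                     (eq :internal
                         (nth-value 1 (find-symbol (symbol-name sym) package))))
            (pushnew sym names))))
      (sort names #'string< :key #'symbol-name)))

  ;; Once the system is loaded, one might call:
  ;; (internal-functions :olefs)
#+end_src
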
** products.xls example

From https://base.google.com/base/products.xls

| id | title              | description                                                                        | link                                        | price | brand | condition | image link                        |          isbn | mpn    |           upc | weight | product type                                                  | quantity | shipping                              | tax                         |
|----+--------------------+------------------------------------------------------------------------------------+---------------------------------------------+-------+-------+-----------+-----------------------------------+---------------+--------+---------------+--------+---------------------------------------------------------------+----------+---------------------------------------+-----------------------------|
|  1 | Red wool sweater   | Comfortable and soft, this sweater will keep you warm on those cold winter nights. | http://www.example.com/item1-info-page.html |    25 | Acme  | new       | http://www.example.com/image1.jpg |               | ABC123 | 0001230001232 | 0.1 lb | "Clothing & Accessories > Clothing > Outerwear > Sweaters"    |        3 | US:MA:Ground:5.95,US:024*:Ground:7.95 | US:CA:8.25:y,US:926*:8.75:y |
|  2 | Total Recall       | Slightly used copy of Total Recall, the sci-fi classic, on DVD.                    | http://www.example.com/item2-info-page.html |    12 | Acme  | used      | http://www.example.com/image2.jpg |               | XXYYZZ | 0004560004560 | 0.2 lb | "Media > DVDs & Videos > Science Fiction"                     |        1 | US:MA:Ground:5.95                     | US::0:                      |
|  3 | Winnie the Pooh    | Used copy. The adventures of Christopher Robin and his friends.                    | http://www.example.com/item3-info-page.html |    20 | Acme  | used      | http://www.example.com/image3.jpg | 0000142404674 |        |               | 0.3 lb | "Media > Books > Fiction > Literature"                        |        1 | US:::5.95                             | US::0:                      |
|  4 | 8" plush Care Bear | Small care bear, brand new, absolutely adorable!                                   | http://www.example.com/item4-info-page.html |  6.99 | Acme  | new       | http://www.example.com/image4.jpg |               | AB001  | 0789012345674 | 0.4 lb | "Toys & Games > Toys > Dolls & Action Figures > Stuffed Toys" |        5 | US:::5.95                             | US:CA:8.25:y                |

#+begin_src text
  CL-USER> (olefs::parse-xls-file "products.xls")
  (:WORKBOOK
   (:SHEET "Products"
           (:LABEL 0 0 "id")
           (:LABEL 0 1 "title")
           (:LABEL 0 2 "description")
           (:LABEL 0 3 "link")
           (:LABEL 0 4 "price")
           (:LABEL 0 5 "brand")
           (:LABEL 0 6 "condition")
           (:LABEL 0 7 "image link")
           (:LABEL 0 8 "isbn")
           (:LABEL 0 9 "mpn")
           (:LABEL 0 10 "upc")
           (:LABEL 0 11 "weight")
           (:LABEL 0 12 "product type")
           (:LABEL 0 13 "quantity")
           (:LABEL 0 14 "shipping")
           (:LABEL 0 15 "tax")
           (:LABEL 1 0 "1")
           (:LABEL 1 1 "Red wool sweater")
           (:LABEL 1 2 "Comfortable and soft, this sweater will keep you warm on those cold winter nights.")
           (:LABEL 1 3 "http://www.example.com/item1-info-page.html")
           (:NUMBER 1 4 25.0D0)
           (:LABEL 1 5 "Acme")
           (:LABEL 1 6 "new")
           (:LABEL 1 7 "http://www.example.com/image1.jpg")
           (:LABEL 1 9 "ABC123")
           (:LABEL 1 10 "0001230001232")
           (:LABEL 1 11 "0.1 lb")
           (:LABEL 1 12 "\"Clothing & Accessories > Clothing > Outerwear > Sweaters\"")
           (:NUMBER 1 13 3.0D0)
           (:LABEL 1 14 "US:MA:Ground:5.95,US:024*:Ground:7.95")
           (:LABEL 1 15 "US:CA:8.25:y,US:926*:8.75:y")
           (:LABEL 2 0 "2")
           (:LABEL 2 1 "Total Recall")
           (:LABEL 2 2 "Slightly used copy of Total Recall, the sci-fi classic, on DVD.")
           (:LABEL 2 3 "http://www.example.com/item2-info-page.html")
           (:NUMBER 2 4 12.0D0)
           (:LABEL 2 5 "Acme")
           (:LABEL 2 6 "used")
           (:LABEL 2 7 "http://www.example.com/image2.jpg")
           (:LABEL 2 9 "XXYYZZ")
           (:LABEL 2 10 "0004560004560")
           (:LABEL 2 11 "0.2 lb")
           (:LABEL 2 12 "\"Media > DVDs & Videos > Science Fiction\"")
           (:NUMBER 2 13 1.0D0)
           (:LABEL 2 14 "US:MA:Ground:5.95")
           (:LABEL 2 15 "US::0:")
           (:LABEL 3 0 "3")
           (:LABEL 3 1 "Winnie the Pooh")
           (:LABEL 3 2 "Used copy. The adventures of Christopher Robin and his friends.")
           (:LABEL 3 3 "http://www.example.com/item3-info-page.html")
           (:NUMBER 3 4 20.0D0)
           (:LABEL 3 5 "Acme")
           (:LABEL 3 6 "used")
           (:LABEL 3 7 "http://www.example.com/image3.jpg")
           (:LABEL 3 8 "0000142404674")
           (:LABEL 3 11 "0.3 lb")
           (:LABEL 3 12 "\"Media > Books > Fiction > Literature\"")
           (:NUMBER 3 13 1.0D0)
           (:LABEL 3 14 "US:::5.95")
           (:LABEL 3 15 "US::0:")
           (:LABEL 4 0 "4")
           (:LABEL 4 1 "8\" plush Care Bear")
           (:LABEL 4 2 "Small care bear, brand new, absolutely adorable!")
           (:LABEL 4 3 "http://www.example.com/item4-info-page.html")
           (:NUMBER 4 4 6.99D0)
           (:LABEL 4 5 "Acme")
           (:LABEL 4 6 "new")
           (:LABEL 4 7 "http://www.example.com/image4.jpg")
           (:LABEL 4 9 "AB001")
           (:LABEL 4 10 "0789012345674")
           (:LABEL 4 11 "0.4 lb")
           (:LABEL 4 12 "\"Toys & Games > Toys > Dolls & Action Figures > Stuffed Toys\"")
           (:NUMBER 4 13 5.0D0)
           (:LABEL 4 14 "US:::5.95")
           (:LABEL 4 15 "US:CA:8.25:y")))
#+end_src

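The result is a plain list, so ordinary list functions are enough to
work with it.  As an illustration, the sketch below groups the cells
of one sheet by row.  It assumes the result shape shown in the
products.xls output, where each cell is a list of the form
=(kind row column value)=.  =sheet-rows= is a hypothetical helper,
not part of cl-olefs.

#+begin_src lisp
  ;; Hypothetical helper, not part of cl-olefs.
  ;; Group the cells of a (:SHEET name cell...) form by row.  Return a
  ;; list of (row (column . value)...) entries ordered by row and,
  ;; within each row, by column.  Empty cells are simply absent.
  (defun sheet-rows (sheet)
    (destructuring-bind (tag name &rest cells) sheet
      (declare (ignore name))
      (assert (eq tag :sheet))
      (let ((rows (make-hash-table)))
        (dolist (cell cells)
          (destructuring-bind (kind row column value) cell
            (declare (ignore kind))
            (push (cons column value) (gethash row rows))))
        (loop for row in (sort (loop for key being the hash-keys of rows
                                     collect key)
                               #'<)
              collect (cons row (sort (gethash row rows) #'< :key #'car))))))

  ;; A few cells copied from the products.xls output.
  (defparameter *sample-workbook*
    '(:workbook
      (:sheet "Products"
       (:label 0 0 "id")
       (:label 0 1 "title")
       (:label 1 0 "1")
       (:label 1 1 "Red wool sweater")
       (:number 1 4 25.0d0))))

  (sheet-rows (second *sample-workbook*))
  ;; => ((0 (0 . "id") (1 . "title"))
  ;;     (1 (0 . "1") (1 . "Red wool sweater") (4 . 25.0d0)))
#+end_src
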
** checkbook.xls example

From http://sunburst.usd.edu/~bwjames/tut/excel/checkbook.xls

| check # | date    | description                    | debit              | credit  | balance   |
|---------+---------+--------------------------------+--------------------+---------+-----------|
|         |         | open account (initial deposit) |                    | $500.00 | $500.00   |
|     101 | 20/9/99 | Burger King                    | $8.57              |         | $491.43   |
|         | 24/9/99 | Paycheck                       |                    | $539.00 | $1,030.43 |
|     102 | 30/9/99 | Hy-Vee                         | $34.18             |         | $996.25   |
|     103 | 22/9/99 | Electric Company               | $74.33             |         | $921.92   |
|     104 | 23/9/99 | Cable TV                       | $24.56             |         | $897.36   |
|         |         |                                |                    |         |           |
|         |         |                                | equation           |         |           |
|         |         |                                | previous balance   |         |           |
|         |         |                                | subtract any debit |         |           |
|         |         |                                | add any credit     |         |           |

#+begin_src text
  CL-USER> (olefs::parse-xls-file "checkbook.xls")
  (:WORKBOOK
   (:SHEET "Sheet1"
           (:LABEL 0 0 "check #")
           (:LABEL 0 1 "date")
           (:LABEL 0 2 "description")
           (:LABEL 0 3 "debit")
           (:LABEL 0 4 "credit")
           (:LABEL 0 5 "balance")
           (:LABEL 1 2 "open account (initial deposit)")
           (:NUMBER 1 4 500.0D0)
           (:LABEL 2 2 "Burger King")
           (:NUMBER 2 3 8.57D0)
           (:NUMBER 3 1 34965.0D0)
           (:LABEL 3 2 "Paycheck ")
           (:NUMBER 3 4 539.0D0)
           (:LABEL 4 2 "Hy-Vee")
           (:NUMBER 4 3 34.18D0)
           (:LABEL 5 2 "Electric Company")
           (:NUMBER 5 3 74.33D0)
           (:LABEL 6 2 "Cable TV")
           (:NUMBER 6 3 24.56D0)
           (:LABEL 8 3 "equation")
           (:LABEL 9 3 "previous balance")
           (:LABEL 10 3 "subtract any debit")
           (:LABEL 11 3 "add any credit"))
   (:SHEET "Sheet2")
   (:SHEET "Sheet3"))
#+end_src

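The cell =(:NUMBER 3 1 34965.0D0)= sits where the table shows the
date 24/9/99, so the date comes back as a raw serial number.
Counting 34965 days from 1 January 1904 gives 24 September 1999,
which suggests that this workbook uses the 1904 date system.  That is
an inference from this single value, not something the parser
reports.  Under that assumption, a conversion might look like the
following sketch; =serial-to-date-1904= is a hypothetical name, not
part of cl-olefs.

#+begin_src lisp
  ;; Hypothetical helper, not part of cl-olefs.
  ;; Assumption: SERIAL counts days since 1904-01-01 (1904 date system).
  ;; Return the date as a list (year month day); any fractional part,
  ;; which would encode the time of day, is dropped.
  (defun serial-to-date-1904 (serial)
    (let ((epoch (encode-universal-time 0 0 0 1 1 1904 0)))
      (multiple-value-bind (second minute hour day month year)
          (decode-universal-time (+ epoch (* 86400 (floor serial))) 0)
        (declare (ignore second minute hour))
        (list year month day))))

  (serial-to-date-1904 34965.0d0)
  ;; => (1999 9 24)
#+end_src

A workbook that uses the 1900 date system would need a different
epoch.
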
* Links

Comments:
http://www.reddit.com/r/lisp/comments/1axs8u/clolefs_step_towards_reading_ppt_doc_xls_in/

Hans Hübner: Dealing with Excel files from Common Lisp - Using ABCL
and Apache POI
http://netzhansa.blogspot.com/2013/03/dealing-with-excel-files-from-common.html

ABCL Dev: M$FT Excel format from Common Lisp.
http://abcl-dev.blogspot.com/2013/03/mft-excel-format-from-common-lisp.html