I use SheetJS (xlsx) to load Excel files in JavaScript (TypeScript).
The usage is a bit peculiar and I forget how to use it easily, so this is a memo for my own use.
manual
install
npm install xlsx
File read
import * as xlsx from 'xlsx'; const book = xlsx.readFile('test.xlsx');
sheet
Sheets are retrieved by specifying the sheet name.
const sheet = book.Sheets['<sheet_name>'];
The sheet list is book.SheetNames, so to specify a sheet by its index, do the following
const sheet = book.Sheets[book.SheetNames[0]];
The check to show/hide sheets is complicated.
Sheet objects do not have show/hide information.
There is an array called "book.Workbook.Sheets" that contains the names and display states of the sheets that the workbook has, and we need to look at that array to find out.
Objects in the array have a "name" property for the sheet name and a "Hidden" property for the display status, with a "Hidden" value of "0" indicating display and "Hidden" indicating non-display.
For example, to check whether to show or hide by sheet name, the following is used.
const _sheets = book.Workbook.Sheets; const idx = _sheets.map((v:any)=>v.name).indexOf(<sheet_name>); console.log(_sheets[idx].Hidden); // 0 for display 1 or 2 for non-display
cell
Cells are addressed and retrieved using the A1 notation.
const cel = sheet['B2'];
Before describing the acquired cells, first an explanation on how to address the cells.
Address designation A1 notation" and "index notation
Use the "A1 notation" to specify to SheetJS (xlsx).
However, from a programmatic standpoint, it is easier to handle rows and columns by index (numerical value), so utility functions are provided to convert between A1 notation and "index notation.
In a program, an address is indicated in "index notation," converted to A1 notation, and then passed as a SheetJS argument.
index notation
cell
Rows are represented by r and columns by c.
The index starts with "0".
// 'A1' { c:0, r:0 }
area
The top left cell (start) is represented by s and the bottom right cell (end) by e.
// 'A1:B2' { s:{c:0, r:0}, e:{c:1, r:1} }
A1 and index conversion
cell
const id = xlsx.util.decode_cell('A1'); // {c:0, r:0} const a1 = xlsx.util.encode_cell({c:0, r:0}); // 'A1'
area
const id = xlsx.util.decode_range('A1:B2'); // { s:{c:0, r:0}, e:{c:1, r:1} } const a1 = xlsx.util.encode_range({ s:{c:0 ,r:0}, e:{c:1, r:1} }); // 'A1:B2'
Data range of the sheet
The data area of the sheet can be obtained with sheet['!ref'].
The area is expressed in A1 notation, so it is converted to "index notation" and used as the range when the for loop is turned.
const range_a1 = sheet['!ref']; const range = xlsx.util.decode_range(range_a1);
value acquisition
Values are obtained in units of cells, addressed in A1 notation.
Values are returned in JSON.
const cel = sheet['A1']; console.log(cel); // {v:123, w:123, t:'n'}
JSON Format
| key name | Content |
|---|---|
| v | raw data |
| w | Display Data |
| t | data type |
How to get the value
- The
wis the value that will be displayed when opened in Excel and is returned as a String. vis the actual data in the cell. For example, ifwis "1900/1/1",vis "1".- In general,
wis the most accurate way to retrieve data. - However, there may be cases where
wis not available, in which case the valuevis used. - Since
wis a String, if the type of the value to be obtained is clearly known, it can be converted and used. t(data type) should not be trusted and should be ignored.
blank cell
Blank cells will result in the retrieved cell being "undefined".
Get Date Value
- The basic
wis acquired and used. - In most cases,
tdoes not becomed(date type) even if it is a date display. - However, in very rare cases,
tbecomesd(date type). - In this case, create a Date from the value of
vas shown below.
import * as moment from 'moment'; const cell = sheet['A1']; let dt = null; if( cell.t==='d'){ dt = moment(cell.v); } else { dt = moment(cell.w); } console.log(dt.format('YYYY-MM-DD'));
JSON and CSV acquisition
There is a utility function to retrieve data from a sheet in JSON and CSV.
If the data on the sheet is tabular data with headers, it is easiest to use this to get the data in JSON.
CSV can be saved as a CSV file by outputting the file as is.
const json = xlsx.util.sheet_to_json(sheet); const csv = xlsx.util.sheet_to_json(sheet); fs.writeFileSync('test.csv', csv);
Sample (output all data in a sheet)
import * as xlsx from 'xlsx'; const book = xlsx.readFile('test.xlsx'); const sheet = book.Sheets[book.SheetNames[0]]; const range = xlsx.utils.decode_range(sheet['!ref']); for(let row=range.s.r; row<=range.e.r; row++){ for(let col=range.s.c; col<=range.e.c; col++){ const address = xlsx.utils.encode_cell({r:row, c:col}); const cell = sheet[address]; if(cell){ console.log( cell.w ? cell.w : cell.v ); } else { console.log( '' ); } } }
Impressions, etc.
I thought about putting together an Excel output as well, but decided not to because since I started using BI tools, I rarely output to Excel anymore, although I do read Excel.