Welcome to new things

File drop with Selenium, Puppeteer, and Playwright

For some time now, I have been compiling notes on how to use Selenium, Puppeteer, and Playwright.

Since those notes are getting long, operations that need extra explanation or several steps get their own articles.

This article covers how to drop files with Selenium, Puppeteer, and Playwright.

Introduction

A file drop here means dragging a local file onto a designated area of a web page so that the file is uploaded to the web.

The image upload in Google Image Search is one example.

I thought that Selenium, Puppeteer, and Playwright had functions to drop such files, but they do not.

So the workaround is to reproduce in JavaScript what happens when a local file is dropped on an element.

What is happening at the file drop

When a file is dropped on an element, a drop event is sent to the element.

The drop event carries a DataTransfer object, and the dropped data is stored in that DataTransfer as File objects.
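
To make this concrete, here is a minimal sketch of what a page's drop handler might look like on the receiving side. The names handleDrop, DropLikeEvent, and DroppedFile are hypothetical and not taken from any real page. The small structural interfaces stand in for the browser's DragEvent, DataTransfer, and File so that the sketch also compiles outside a browser.

```typescript
// Minimal structural types (assumptions for illustration).
// In a real page these would be DragEvent, DataTransfer and File.
interface DroppedFile {
  name: string;
  size: number;
}

interface DropLikeEvent {
  dataTransfer: { files: ArrayLike<DroppedFile> } | null;
  preventDefault(): void;
}

// Hypothetical handler: returns the names of all dropped files.
function handleDrop(event: DropLikeEvent): string[] {
  // Stop the browser from opening the dropped file itself.
  event.preventDefault();
  // The dropped files live on the event's DataTransfer object.
  const files = event.dataTransfer ? Array.from(event.dataTransfer.files) : [];
  return files.map((file) => file.name);
}
```

In a browser, a page would typically register such a handler with something like dropArea.addEventListener('drop', handleDrop). This is why a synthetic drop event only needs a DataTransfer that holds the file: that is all a handler of this kind reads.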

Loading local files

For security reasons, JavaScript inside a web page cannot read local files directly.

Therefore, an <input type="file" /> node is used as a bridge between the inside of the web page and local files.

Although an <input type="file" /> node can be created from JavaScript inside the web page, its local file path can only be set from outside the page, again for security reasons.

So, after creating the <input type="file" /> node in the page, set the local file path from the automation tool.

The <input type="file" /> node then reads the local file and stores it in its files property (HTMLInputElement.files).

Flow

Based on the above, the file drop process is as follows.

  • Create an <input type="file" /> node
  • Set the local file path on the <input type="file" /> node
  • Create a DataTransfer object
  • Add the file held by the <input type="file" /> node to the DataTransfer
  • Create a drop event
  • Set the DataTransfer as the drop event's dataTransfer
  • Send the drop event to the element that should receive the drop

The following are example implementations of a file drop using Google Image Search. The syntax differs slightly between the tools, but all three do the same thing.

Selenium

import os

from selenium import webdriver
from selenium.webdriver.common.by import By

# assumption: a local Chrome driver; any WebDriver is used the same way
driver = webdriver.Chrome()
driver.get('https://www.google.com/imghp')

# open the drop area
driver.find_element(
    By.CSS_SELECTOR,
    'form[role="search"] div[role="button"]:last-child'
).click()

# create <input type="file" /> inside the page
file_input = driver.execute_script("""
    const _input = document.createElement('INPUT');
    _input.setAttribute('type', 'file');
    document.documentElement.appendChild(_input);
    return _input;
    """)

# set the path of the file to upload (as an absolute path)
file_input.send_keys(os.path.abspath("./data/img/sample.jpg"))

# get the drop area
drop = driver.find_element(
    By.CSS_SELECTOR, 'form img:nth-child(1) + div + div')

# build a DataTransfer holding the file and dispatch the drop event
driver.execute_script("""
    const _drop = arguments[0];
    const _input = arguments[1];
    const _dataTransfer = new DataTransfer();
    _dataTransfer.items.add(_input.files[0]);
    const _event = new DragEvent('drop', {
        dataTransfer: _dataTransfer,
        bubbles: true,
        cancelable: true
    });
    _drop.dispatchEvent(_event);
    """, drop, file_input)

Puppeteer

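import type { ElementHandle } from 'puppeteer';

// assumption: `page` is an already opened Puppeteer Page
await page.goto('https://www.google.com/imghp');

let selector: string;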

// open drop area
selector = 'form[role="search"] div[role="button"]:last-child';
await page.waitForSelector(selector);
await page.click(selector);

// create <input type="file" />
const input = await page.evaluateHandle(() => {
    const _input = document.createElement('INPUT');
    _input.setAttribute('type', 'file');
    document.documentElement.appendChild(_input);
    return _input;
}) as ElementHandle<HTMLInputElement>;

// set uploadfile path
await input.uploadFile('./data/img/sample.jpg');

// get drop area
selector = 'form img:nth-child(1) + div + div';
const drop = await page.waitForSelector(selector);

// dispatch drop event
await page.evaluate((_drop, _input) => {
    const _dataTransfer = new DataTransfer();
    _dataTransfer.items.add(_input.files[0]);
    const _event = new DragEvent('drop', {
        dataTransfer: _dataTransfer,
        bubbles: true,
        cancelable: true
    });
    _drop.dispatchEvent(_event);
}, drop, input);

Playwright

await page.goto('https://www.google.com/imghp');

// open drop area
await page.locator('form[role="search"] div[role="button"]:last-child').click();

// create <input type="file" />
const input = await page.evaluateHandle(() => {
    const _input = document.createElement('INPUT');
    _input.setAttribute('type', 'file');
    _input.id = 'id_drop_file';
    document.documentElement.appendChild(_input);
    return _input;
});

// set uploadfile path
await page.locator('#id_drop_file').setInputFiles('./data/img/sample.jpg');

// get drop area
const drop = await page
    .locator('form img:nth-child(1) + div + div')
    .evaluateHandle((dom: Element) => dom);

// dispatch drop event
await page.evaluate(arg => {
    const _drop = arg[0];
    const _input = arg[1] as HTMLInputElement;
    const _dataTransfer = new DataTransfer();
    _dataTransfer.items.add(_input.files[0]);
    const _event = new DragEvent('drop', {
        dataTransfer: _dataTransfer,
        bubbles: true,
        cancelable: true
    });
    _drop.dispatchEvent(_event);
}, [drop, input]);

Impressions, etc.

Playwright is convenient for normal use because locator hides JSHandle, but the code becomes awkward to write when a JSHandle is actually needed.

This turned out to be less about how to use Selenium, Puppeteer, or Playwright and more about JavaScript. I also looked through the Chrome DevTools Protocol for a better way, but did not find one.

Other Methods

At first, I did not know the <input type="file" /> approach, so I read the file on the script side and sent its contents to the JavaScript in the web page.

The contents of the file are binary, but only serializable values such as strings and numbers can be passed to the JavaScript in the page. So the binary is first encoded as a Base64 string, sent, and then decoded back to binary on the JavaScript side of the web page.

The next step is to create a File from the restored binary and add it to a DataTransfer.

It's a brute-force technique, but it still works. I learned a lot during this trial-and-error process, so I'll leave it here for your reference.

import * as fs from 'fs';
import * as path from 'path';
import * as mime from 'mime';

await page.goto('https://www.google.com/imghp');

// open drop area
await page.locator('form[role="search"] div[role="button"]:last-child').click();

// read file & binary to string
const filePath = './data/img/sample.jpg';
const mimeType = mime.getType(path.parse(filePath).ext);
const binaryBase64 = fs.readFileSync(filePath).toString('base64');

// create DataTransfer
const dataTransfer = await page.evaluateHandle((param) => {

    // string to binary
    const _binaryString = atob(param.data);
    const _len = _binaryString.length;
    const _bytes = new Uint8Array(_len);
    for (let i = 0; i < _len; ++i) {
        _bytes[i] = _binaryString.charCodeAt(i);
    }

    // create file
    const _file = new File([_bytes.buffer], param.path, { type: param.type });

    // create DataTransfer
    const _dataTransfer = new DataTransfer();
    _dataTransfer.items.add(_file);

    return _dataTransfer;

}, { data: binaryBase64, path: filePath, type: mimeType });

// dispatch drop event to drop area
await page.locator('form img:nth-child(1) + div + div')
    .dispatchEvent('drop', { dataTransfer });
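
The Base64 round trip used above can be checked in isolation. The following is a minimal sketch, not part of the original implementation: bytesToBase64 and base64ToBytes are hypothetical helper names, and btoa is used for encoding instead of Node's Buffer so that the sketch depends only on standard globals. The decoding loop is the same one the in-page code performs.

```typescript
// Encode raw bytes as a Base64 string.
// (The script above uses Buffer#toString('base64') for this step.)
function bytesToBase64(bytes: Uint8Array): string {
  let binary = '';
  for (let i = 0; i < bytes.length; ++i) {
    binary += String.fromCharCode(bytes[i]);
  }
  return btoa(binary);
}

// Decode a Base64 string back into bytes, one char code per byte.
function base64ToBytes(base64: string): Uint8Array {
  const binary = atob(base64);
  const bytes = new Uint8Array(binary.length);
  for (let i = 0; i < binary.length; ++i) {
    bytes[i] = binary.charCodeAt(i);
  }
  return bytes;
}

// Four bytes typical of the start of a JPEG file, all above 0x7f,
// so they would not survive being sent as plain text.
const original = new Uint8Array([0xff, 0xd8, 0xff, 0xe0]);
const encoded = bytesToBase64(original); // "/9j/4A=="
const restored = base64ToBytes(encoded); // same four bytes as `original`
```

Because Base64 uses only ASCII characters, the encoded string can be passed to the page as an ordinary argument and the bytes come back unchanged.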
