Keyboard actions
A representation of any key input device for interacting with a web page.
In addition to the high-level element interactions, the Actions API provides granular control over exactly what designated input devices can do. Selenium provides an interface for three kinds of input sources: a key input for keyboard devices, a pointer input for mouse, pen, or touch devices, and a wheel input for scroll wheel devices (introduced in Selenium 4.2). You construct individual action commands assigned to specific inputs, chain them together, and then call the associated perform method to execute them all at once.
In the move from the legacy JSON Wire Protocol to the W3C WebDriver Protocol, the low-level building blocks of actions became especially detailed. The resulting API is extremely powerful, but each input device can be used in a number of ways, and if you need to manage more than one device, you are responsible for ensuring proper synchronization between them.
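To make the synchronization point concrete, here is a minimal sketch of the tick-aligned payload shape that the W3C protocol uses for actions, written as plain Python dictionaries with no Selenium dependency. The helper names (`pause`, `key_source`, `pointer_source`, `align_ticks`) are hypothetical and not part of any Selenium API, and padding shorter sources with zero-length pauses is an assumption about how a client might keep its sources aligned.

```python
import json

SHIFT = "\ue008"  # WebDriver code point for the Shift key


def pause(duration_ms=0):
    """A tick in which this input source does nothing."""
    return {"type": "pause", "duration": duration_ms}


def key_source(source_id, actions):
    """One key input source (hypothetical helper)."""
    return {"type": "key", "id": source_id, "actions": list(actions)}


def pointer_source(source_id, actions, pointer_type="mouse"):
    """One pointer input source (hypothetical helper)."""
    return {
        "type": "pointer",
        "id": source_id,
        "parameters": {"pointerType": pointer_type},
        "actions": list(actions),
    }


def align_ticks(sources):
    """Pad shorter action lists with pauses so every source has the same tick count."""
    longest = max(len(source["actions"]) for source in sources)
    for source in sources:
        missing = longest - len(source["actions"])
        source["actions"].extend(pause() for _ in range(missing))
    return {"actions": sources}


# Shift+click: the mouse waits one tick so Shift is already down when it presses.
keyboard = key_source("keyboard", [
    {"type": "keyDown", "value": SHIFT},
    pause(),
    pause(),
    {"type": "keyUp", "value": SHIFT},
])
mouse = pointer_source("mouse", [
    pause(),
    {"type": "pointerDown", "button": 0},
    {"type": "pointerUp", "button": 0},
])

payload = align_ticks([keyboard, mouse])
print(json.dumps(payload, indent=2))
```

Actions at the same index in each list belong to the same tick, so ordering across devices is expressed purely by where the pauses sit. This is the bookkeeping the convenience methods handle for you.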
Thankfully, you likely do not need to learn how to use the low-level commands directly, since almost everything you might want to do has a convenience method that combines the lower-level commands for you. These are all documented in the keyboard, mouse, pen, and wheel pages.
Pointer movements and wheel scrolling allow the user to set a duration for the action, but sometimes you just need to wait a beat between actions for things to work correctly. The pause method inserts such a wait into the chain.
Java
    WebElement clickable = driver.findElement(By.id("clickable"));
    new Actions(driver)
        .moveToElement(clickable)
        .pause(Duration.ofSeconds(1))
        .clickAndHold()
        .pause(Duration.ofSeconds(1))
        .sendKeys("abc")
        .perform();
Python
    clickable = driver.find_element(By.ID, "clickable")
    ActionChains(driver)\
        .move_to_element(clickable)\
        .pause(1)\
        .click_and_hold()\
        .pause(1)\
        .send_keys("abc")\
        .perform()
C#
    IWebElement clickable = driver.FindElement(By.Id("clickable"));
    new Actions(driver)
        .MoveToElement(clickable)
        .Pause(TimeSpan.FromSeconds(1))
        .ClickAndHold()
        .Pause(TimeSpan.FromSeconds(1))
        .SendKeys("abc")
        .Perform();
Ruby
    clickable = driver.find_element(id: 'clickable')
    driver.action
          .move_to(clickable)
          .pause(duration: 1)
          .click_and_hold
          .pause(duration: 1)
          .send_keys('abc')
          .perform
JavaScript
    const clickable = await driver.findElement(By.id('clickable'))
    await driver.actions()
        .move({ origin: clickable })
        .pause(1000)
        .press()
        .pause(1000)
        .sendKeys('abc')
        .perform()
Kotlin
    val clickable = driver.findElement(By.id("clickable"))
    Actions(driver)
        .moveToElement(clickable)
        .pause(Duration.ofSeconds(1))
        .clickAndHold()
        .pause(Duration.ofSeconds(1))
        .sendKeys("abc")
        .perform()
An important thing to note is that the driver remembers the state of all the input devices throughout a session. Even if you create a new instance of an actions class, any depressed keys and the location of the pointer remain in whatever state a previously performed action left them.
There is a special method to release all currently depressed keys and pointer buttons. This method is implemented differently in each of the languages because it does not get executed with the perform method.
Java
    ((RemoteWebDriver) driver).resetInputState();
Python
    ActionBuilder(driver).clear_actions()
C#
    ((WebDriver)driver).ResetInputState();
Ruby
    driver.action.release_actions
JavaScript
    await driver.actions().clear()
Kotlin
    (driver as RemoteWebDriver).resetInputState()
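The persistence described above can be illustrated with a toy model. This is a sketch only: `SessionInputState` and `ToyActions` are hypothetical classes that stand in for the driver session and an actions builder, and they do not talk to a browser or use any Selenium API. The point is that the state lives with the session, so a new builder inherits whatever the previous one left behind until the release call is made.

```python
class SessionInputState:
    """Toy stand-in for the input state a driver session remembers."""

    def __init__(self):
        self.pressed_keys = set()
        self.pressed_buttons = set()
        self.pointer_position = (0, 0)

    def release_all(self):
        """Mimics the release call: lets go of all keys and pointer buttons.

        The pointer position is left alone here; what a real driver does
        with it is not modeled in this sketch.
        """
        self.pressed_keys.clear()
        self.pressed_buttons.clear()


class ToyActions:
    """Hypothetical builder: queues its own steps, but mutates shared session state."""

    def __init__(self, state):
        self._state = state
        self._steps = []

    def key_down(self, key):
        self._steps.append(("key_down", key))
        return self

    def move_to(self, x, y):
        self._steps.append(("move_to", (x, y)))
        return self

    def click_and_hold(self):
        self._steps.append(("button_down", 0))
        return self

    def perform(self):
        """Apply the queued steps to the session state, then empty the queue."""
        for kind, value in self._steps:
            if kind == "key_down":
                self._state.pressed_keys.add(value)
            elif kind == "button_down":
                self._state.pressed_buttons.add(value)
            elif kind == "move_to":
                self._state.pointer_position = value
        self._steps.clear()


session = SessionInputState()
ToyActions(session).key_down("Shift").move_to(40, 25).click_and_hold().perform()

# A brand-new builder with nothing queued does not reset anything.
ToyActions(session).perform()
before_reset = (
    set(session.pressed_keys),
    set(session.pressed_buttons),
    session.pointer_position,
)
print("before release:", before_reset)

# Only the explicit release call clears the held keys and buttons.
session.release_all()
print("after release:", session.pressed_keys, session.pressed_buttons)
```

In a real test suite, the practical consequence is the same: if one test leaves a key or button held, call the release method for your language binding before the next test builds its own actions.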
Keyboard actions: A representation of any key input device for interacting with a web page.
Mouse actions: A representation of any pointer device for interacting with a web page.
Pen actions: A representation of a pen stylus kind of pointer input for interacting with a web page.
Scroll wheel actions: A representation of a scroll wheel input device for interacting with a web page.