Latency is the time it takes for a user to receive a response from a system after an interaction. Latency affects how interactive the user perceives the system to be. When users are burdened with excessive latency, they may think the system is broken and either repeat the interaction, putting unnecessary stress on the system, or simply give up on the task. Loading screens and spinners should be removed where possible, and network delays should be masked by preloading data and anticipating which resources the user will need for subsequent pages or tasks. When the technical means of limiting latency have been exhausted, the designer must deal with the remaining latency by keeping the user informed: never let the user think an interaction was not registered. If the user has to wait, manage their expectations by telling them how long the delay will be, or distract them to keep them engaged with the system.
Bruce Tognazzini, First Principles of Interaction Design, accessed October 13, 2017, http://asktog.com/atc/principles-of-interaction-design/#latencyReduction
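To make the preloading idea concrete, here is a minimal sketch in TypeScript. It is only an illustration under my own assumptions, not something taken from Tognazzini's article: `PrefetchCache` and `Loader` are hypothetical names, and the loader can be any function that fetches a resource by key. The idea is to start loading a resource as soon as the system can guess the user will need it, so the request is already in flight (or finished) by the time the user asks for it.

```typescript
// Hypothetical loader type: any function that fetches a resource by key.
type Loader<T> = (key: string) => Promise<T>;

class PrefetchCache<T> {
  private pending = new Map<string, Promise<T>>();
  private load: Loader<T>;

  constructor(load: Loader<T>) {
    this.load = load;
  }

  // Start loading a resource the user is likely to need next.
  // Calling this again for the same key does not start a second request.
  prefetch(key: string): void {
    if (this.pending.has(key)) return;
    const request = this.load(key);
    // Forget failed requests so a later attempt can retry them.
    request.catch(() => this.pending.delete(key));
    this.pending.set(key, request);
  }

  // Return the resource, reusing the prefetched request if one exists.
  get(key: string): Promise<T> {
    this.prefetch(key);
    return this.pending.get(key)!;
  }
}
```

A page could call `prefetch` when the user hovers over a link or reaches a step where the next page is predictable, and call `get` when the user actually navigates. Whether the guess is right depends on how well the designer anticipates the user's next task.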
Original source image

http://asktog.com/atc/principles-of-interaction-design/#latencyReduction
Online image

Robot assisted surgery
https://www.nuh.com.sg/news/media-articles_837.html
Real world image

Card chip readers typically take about 5-10 seconds to complete a transaction. The delay is a limitation imposed by the technology, so the terminal displays a wait message to prevent the user from removing the card prematurely and putting the transaction into an error state. Target, The District, Tustin

Hi Gary, you did a great job describing the concept of latency and how it affects user frustration and perception of the system. I like how you provided ways to combat the negative effects of latency and introduced best practices for when network delays can’t be avoided: preloading data instead of showing loading screens and spinners, and anticipating the resources needed for subsequent web pages or tasks. There’s a guideline in UX design that response times should be reduced to one-tenth of a second. (That is the limit for the user to feel that the system is reacting instantaneously, meaning that no special feedback is necessary beyond displaying the result.)
Unfortunately, latency reduction is a big issue on some of the software products at work, which run on proprietary but unresponsive hardware. I see it all the time in user testing: the participant taps the back button and, after a delay, presses it repeatedly, which triggers the system to advance multiple times and looks like a bug. A five-second delay is inexcusable. I wish there were a way to deal with the latency by keeping the user informed beyond an active press state on the button. Even visual feedback isn’t always enough to prevent the user from thinking an interaction was not registered. I wish we could distract them to keep them engaged, even if only with a screen transition, but animations and advanced effects are taxing on the system. This is latency at its worst (for me anyway), and it hits home especially hard because I have personally witnessed it.
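One cheap mitigation for the repeated back-button presses is to ignore presses that arrive while the first one is still being handled. Below is a rough sketch in TypeScript, assuming the navigation can be wrapped as an asynchronous action. `ignoreWhileBusy` is a hypothetical helper I made up for illustration, not part of any product or library, and I have not tried it on our hardware.

```typescript
// Wraps an async action so that presses arriving while it is still
// running are dropped instead of queued.
// The returned function resolves to true if the press was handled
// and false if it was ignored.
function ignoreWhileBusy(action: () => Promise<void>): () => Promise<boolean> {
  let busy = false;
  return async () => {
    if (busy) return false; // a previous press is still in progress
    busy = true;
    try {
      await action();
      return true;
    } finally {
      busy = false; // accept presses again, even if the action failed
    }
  };
}
```

This does not reduce the delay itself and it does not tell the user anything, so it would still need to be paired with some form of feedback. It only keeps the extra presses from turning into extra navigations once the system catches up.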