Really, SPI and I2C are serial as well, but I'm treating them separately; they are separate circuits on the '328 and provide different protocols.
So when we talk about serial input and output (I/O), we generally mean I/O using a USART (Universal Synchronous/Asynchronous Receiver/Transmitter).
Serial means one bit at a time. If we want to communicate an 8-bit value, one bit goes over the wire at a time. For an 8-bit parallel connection, all 8 bits go over 8 separate wires at the same time. The advantage of serial is that only 1 wire is needed for the data; the disadvantage is that it takes longer (one bit at a time!) and timing is more of an issue.
So now we have synchronous and asynchronous. This has to do with how we handle the timing issue, that is, how we indicate or determine when each bit is on the wire. For synchronous communication a clock is provided along with the data. Data values are synchronized with edges of the clock signal. For example, data could be valid (and should be read) on a falling edge of the clock. In the '328 this is programmable. For asynchronous communication there is no clock line; instead both sides agree in advance on how fast data will be transferred (the baud rate), and the receiver times each bit from that. There is a standard frame format involved to transfer one word of data. These standards have been around for a long time.
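To make the baud rate agreement concrete, here is a rough sketch of the arithmetic involved. The bit time is just the reciprocal of the baud rate. The divisor formula is the one the '328 datasheet gives for asynchronous normal mode (clock / (16 × baud) − 1); the function names and the 16 MHz clock are my own assumptions for illustration, not anything from a library.

```python
def bit_time_us(baud):
    """Duration of one bit on the wire, in microseconds."""
    return 1_000_000 / baud


def ubrr_for(f_cpu, baud):
    """Nearest integer divisor for asynchronous normal mode.

    Implements f_cpu / (16 * baud) - 1, rounded to the nearest integer
    using integer arithmetic only.
    """
    return (f_cpu + 8 * baud) // (16 * baud) - 1


def actual_baud(f_cpu, ubrr):
    """Baud rate the hardware really produces for a given divisor."""
    return f_cpu / (16 * (ubrr + 1))


if __name__ == "__main__":
    f_cpu = 16_000_000  # assumed clock; use whatever your board runs at
    for baud in (9600, 115200):
        ubrr = ubrr_for(f_cpu, baud)
        real = actual_baud(f_cpu, ubrr)
        error = (real - baud) / baud * 100
        print(f"{baud} baud: bit time {bit_time_us(baud):.1f} us, "
              f"divisor {ubrr}, actual {real:.0f} baud ({error:+.1f}%)")
```

Because the divisor has to be an integer, the rate you actually get is usually a little off from the one you asked for. That's fine as long as both ends stay close enough that the receiver is still sampling inside the right bit by the end of the frame.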
A word can be between 5 and 9 bits on the '328, and there is an optional parity bit that is used for error detection. If a parity bit is present, the receiving hardware can use its value to detect a problem in the transmission. The parity bit is computed by XORing the bits of the data word. That will be the value of the parity bit if even parity is specified. The value for odd parity is that result inverted. Like the baud rate, word size and parity have to be agreed upon.
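The parity calculation is small enough to sketch directly. This is only an illustration of the XOR rule described above; `parity_bit` and its parameters are names I made up, and on the '328 the hardware does this for you.

```python
def parity_bit(word, data_bits=8, odd=False):
    """Return the parity bit for the low `data_bits` bits of `word`.

    Even parity is the XOR of all the data bits; odd parity is that
    result inverted.
    """
    parity = 0
    for i in range(data_bits):
        parity ^= (word >> i) & 1
    return parity ^ 1 if odd else parity


if __name__ == "__main__":
    value = 0x41  # ASCII 'A', which has two 1 bits
    print("even:", parity_bit(value))
    print("odd: ", parity_bit(value, odd=True))
```

Another way to think of it: with even parity the total number of 1s in the data plus the parity bit is always even, and with odd parity it is always odd. A single flipped bit breaks that, which is how the receiver notices. Two flipped bits cancel out and go undetected.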
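The line is held high when idle, and transmission of a word begins with it going low for one bit's worth of time (the start bit). That is followed by the data bits, possibly a parity bit, and finally the stop, during which the line goes back high for one or two bit times (two gives the double-width stop "bit").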
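Putting the pieces of a frame together, the line levels for one word can be sketched like this. I'm assuming the usual UART convention of sending the least significant bit first, which the text above doesn't spell out; `frame_levels` and its parameters are hypothetical names for illustration only.

```python
def frame_levels(word, data_bits=8, parity=None, stop_bits=1):
    """Return the sequence of line levels (0 = low, 1 = high) for one frame.

    parity is None, "even", or "odd". Data bits are emitted least
    significant bit first (assumed, per the usual UART convention).
    """
    if parity not in (None, "even", "odd"):
        raise ValueError("parity must be None, 'even', or 'odd'")

    data = [(word >> i) & 1 for i in range(data_bits)]

    levels = [0]          # start bit: line goes low for one bit time
    levels += data        # data bits, LSB first
    if parity is not None:
        even = sum(data) % 2  # same result as XORing the data bits
        levels.append(even if parity == "even" else even ^ 1)
    levels += [1] * stop_bits  # stop: line returns high
    return levels


if __name__ == "__main__":
    print(frame_levels(0x41))                               # 8 data bits, no parity
    print(frame_levels(0x41, parity="even", stop_bits=2))   # with parity, long stop
```

Each entry in the list lasts exactly one bit time at the agreed baud rate, so a frame with 8 data bits, no parity, and one stop bit occupies the line for 10 bit times in total.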
Serial I/O takes a minimum of 3 wires: transmit (Tx), receive (Rx), and ground. More lines can be used to coordinate communication. Back in the day serial was one of the staple communication methods between devices. Modems, which allowed data communication over the telephone system (anyone see the movie WarGames or the show Halt and Catch Fire?), were serial in nature.
Printers were often serial as well. While parallel printer interfaces were common with home computers, larger installations had their printers some distance from the computer, and serial interfaces were much better at handling longer cables. Terminals (keyboard and display) were serial.
Even these days MCUs will pretty much always have serial I/O support, although it's often used with USB for connecting to a Windows/macOS/Linux system. Most MCU boards have an onboard converter to go between USB and serial (e.g. the Arduino UNO) or have USB support directly in the MCU itself (e.g. the ATmega32U4 MCU). Boards that don't (I'm thinking of the Arduino Pro Mini that uses the '328) will have an FTDI connector. That's just a serial interface with a specific arrangement of signals on a pin-header to which you can connect a USB <-> serial cable, or simply a logic-level serial cable.
Serial support on MCU boards is now typically logic level (i.e. signals are 0 V for a low, and 3.3 V or 5 V for a high). This is fine because serial connections these days are typically short: between the MCU board and a computer, or between the MCU board and a GPS sensor (which are often serial). When serial was used to connect terminals and printers to minicomputers, those terminals could be in a different room, on a different floor, or even in a different building. Logic-level signals don't work over those distances. A standard called RS-232 was developed to handle the challenge. It uses higher voltages (±12 V was common) to generate more robust signals that could be used over those distances without degrading too much.