Can someone point me to a book or website that explains these basics clearly?
You can check this XML Tutorial with examples.
But what about the encoding part? Why is that necessary?
The W3C explains encoding as follows:
"The document character set for XML and HTML 4.0 is Unicode (aka ISO 10646). This means that HTML browsers and XML processors should behave as if they used Unicode internally. But it doesn't mean that documents have to be transmitted in Unicode. As long as client and server agree on the encoding, they can use any encoding that can be converted to Unicode..."
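To make that concrete, here is a minimal sketch in Python using the standard library's xml.etree.ElementTree. The helper make_doc and the sample text are made up for illustration; they are not from the W3C text. It serializes the same small document in two encodings, shows that both parse to the same Unicode string, and shows what can go wrong when the declared encoding does not match the actual bytes.

```python
import xml.etree.ElementTree as ET

# Sample text (an assumption for this demo): "cafe" with an accented e,
# which lies outside ASCII and so is encoded differently per encoding.
text = "caf\u00e9"


def make_doc(declared: str, actual: str) -> bytes:
    """Build a tiny XML document (hypothetical helper).

    `declared` is the name written in the XML declaration;
    `actual` is the encoding really used to produce the bytes.
    """
    xml = f'<?xml version="1.0" encoding="{declared}"?><note>{text}</note>'
    return xml.encode(actual)


doc_utf8 = make_doc("UTF-8", "utf-8")
doc_latin1 = make_doc("ISO-8859-1", "iso-8859-1")

# On the wire the two documents are different byte sequences ...
print(text.encode("utf-8"), text.encode("iso-8859-1"))

# ... but the parser reads the declaration, decodes accordingly,
# and hands back the same Unicode string in both cases.
parsed_utf8 = ET.fromstring(doc_utf8).text
parsed_latin1 = ET.fromstring(doc_latin1).text
print(parsed_utf8 == parsed_latin1 == text)

# If the declaration and the actual bytes disagree, the parser
# cannot decode the document reliably; here it rejects it.
try:
    ET.fromstring(make_doc("UTF-8", "iso-8859-1"))
    mismatch_rejected = False
except ET.ParseError:
    mismatch_rejected = True
print(mismatch_rejected)
```

So the encoding declaration is how the sender and the parser "agree on the encoding": the bytes may vary, but the characters the application sees are always Unicode.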