The Concept
Think of the whole thing as if you were the person who invented it, like this:
First think of an array and how it is implemented at the low level: it is basically just a set of contiguous memory locations (memory locations that are next to each other). With that mental image in your head, consider that you can access ANY of those memory locations and remove or add data anywhere in the array at will.
Now think of that same array, but instead of allowing changes at any location, you decide that you will only ever add or remove data at the LAST location. This way of manipulating the data is called LIFO, which means Last In First Out. The idea is a good one because it makes it easy to keep track of the content of the array: you never leave gaps in the middle and never have to shift the remaining elements around when you remove something.
To know at all times what the address of the last object in the array is, you dedicate one register in the CPU to keep track of it. Every time you remove or add something, you also decrement or increment the address in that register by the amount of address space the removed or added objects occupied.
You also want that amount to be fixed (like 4 memory locations, i.e. 4 bytes) per object. This again makes it easier to keep track, and it makes it possible to use the register in loop constructs, because loops use a fixed increment per iteration. For example, to walk through your array you write a loop that increments the register by 4 each iteration, which would not be possible if the array held objects of different sizes.
Lastly, you choose to call this new data structure a "Stack", because it reminds you of a stack of plates in a restaurant, where plates are always added to or removed from the top.
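To make the idea concrete, here is a minimal sketch in C of an array that is only ever changed at its end, with a single tracker that moves by a fixed 4 bytes per object. The names (`ByteStack`, `stack_push`, `stack_pop`, `stack_sum`) and the 64-byte capacity are made up for this illustration. As a simplification, `top` holds the offset just past the last object rather than the address of the last object itself, and this version grows upward.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define SLOT_SIZE   4u   /* every object occupies exactly 4 bytes */
#define STACK_BYTES 64u  /* arbitrary capacity chosen for this sketch */

typedef struct {
    uint8_t mem[STACK_BYTES]; /* the contiguous memory locations */
    size_t  top;              /* plays the role of the dedicated register:
                                 byte offset just past the last object */
} ByteStack;

void stack_init(ByteStack *s) {
    s->top = 0;
}

/* Add one object at the end and advance the tracker by one slot. */
void stack_push(ByteStack *s, uint32_t value) {
    assert(s->top + SLOT_SIZE <= STACK_BYTES); /* overflow guard */
    memcpy(&s->mem[s->top], &value, SLOT_SIZE);
    s->top += SLOT_SIZE;
}

/* Remove the last object only: step back one slot and read it. */
uint32_t stack_pop(ByteStack *s) {
    uint32_t value;
    assert(s->top >= SLOT_SIZE);               /* underflow guard */
    s->top -= SLOT_SIZE;
    memcpy(&value, &s->mem[s->top], SLOT_SIZE);
    return value;
}

/* Because every slot has the same size, a loop can walk the whole
   stack with a fixed stride of SLOT_SIZE bytes per iteration. */
uint32_t stack_sum(const ByteStack *s) {
    uint32_t total = 0;
    for (size_t off = 0; off < s->top; off += SLOT_SIZE) {
        uint32_t value;
        memcpy(&value, &s->mem[off], SLOT_SIZE);
        total += value;
    }
    return total;
}
```

Pushing 10, 20 and 30 and then popping three times returns 30, 20 and 10, in that order, which is exactly the Last In First Out behaviour described above.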
The Implementation
As you can see, a stack is nothing more than an array of contiguous memory locations that you have decided to manipulate in a particular way. Because of that, you don't even need the special instructions and registers to control the stack. You can implement it yourself with the basic mov, add and sub instructions, using general purpose registers instead of ESP and EBP, like this:
mov edx, 0FFFFFFFFh
; --> this will be the start address of your stack, furthest away from your code and data (the value here is only illustrative; in a real program it must be an address inside memory your program owns). It will also serve as the register that keeps track of the last object in the stack, as I explained earlier. You call it the "stack pointer", so you choose the register EDX to be what ESP is normally used for.
sub edx, 4
mov eax, dword ptr [someVar]
mov dword ptr [edx], eax
; --> these three instructions decrement your stack pointer by 4 memory locations and copy the 4 bytes starting at the [someVar] memory location to the memory location that EDX now points to. The copy goes through EAX because x86 has no mov from memory to memory. This is just like a PUSH instruction decrementing ESP, only here you did it manually and you used EDX. So the PUSH instruction is basically just a shorter opcode that does this with ESP.
mov eax, dword ptr [edx]
add edx, 4
; --> and here we do the opposite: we first copy the 4 bytes starting at the memory location that EDX now points to into the register EAX (arbitrarily chosen here, we could have copied it anywhere we wanted), and then we increment our stack pointer EDX by 4 memory locations. This is what the POP instruction does.
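The same manual push and pop can be modelled in C to check the bookkeeping. This is only a sketch under stated assumptions: `Machine`, `manual_push` and `manual_pop` are hypothetical names, a 64-byte array stands in for memory, and `sp` is an offset into that array playing the role EDX plays above. As in the assembly, the stack starts at the high end and grows downward.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define MEM_BYTES 64u  /* arbitrary size of the simulated memory */

typedef struct {
    uint8_t  mem[MEM_BYTES]; /* stands in for RAM */
    uint32_t sp;             /* stands in for EDX, as an offset into mem */
} Machine;

/* Like "mov edx, <start>": point the stack pointer at the high end. */
void machine_init(Machine *m) {
    memset(m->mem, 0, MEM_BYTES);
    m->sp = MEM_BYTES;
}

/* Like "sub edx, 4" followed by a 4-byte store to [edx]. */
void manual_push(Machine *m, uint32_t value) {
    assert(m->sp >= 4u);              /* no room left below sp */
    m->sp -= 4u;
    memcpy(&m->mem[m->sp], &value, 4u);
}

/* Like "mov eax, dword ptr [edx]" followed by "add edx, 4". */
uint32_t manual_pop(Machine *m) {
    uint32_t value;
    assert(m->sp + 4u <= MEM_BYTES);  /* nothing left to pop */
    memcpy(&value, &m->mem[m->sp], 4u);
    m->sp += 4u;
    return value;
}
```

Note the order of operations: the pointer moves before the store on a push and after the load on a pop, so `sp` always refers to the most recently pushed value.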
Now you can see that the instructions PUSH and POP and the registers ESP and EBP were just added by Intel to make the above concept of the "stack" data structure easier to write and read. There are still some RISC (Reduced Instruction Set Computer) CPUs that don't have PUSH and POP instructions or dedicated registers for stack manipulation, and when writing assembly programs for those CPUs you have to implement the stack yourself, just like I showed you.