The Jump Bug

In the last post, I mentioned a bug I was working on that would cause all 0s to appear on the data bus after a JSR instruction was executed. This bug turned out to be far more elusive, and a far bigger time sink, than it should have been.

When I had finished wiring up the basic system (EEPROMs, SRAM, one 65c22 I/O chip, address decoding, clock), I decided to test what I had to make sure everything was running correctly. I went back and followed along with Ben Eater’s project where he starts experimenting with the 65c22. Initially, everything was going great: the blue LEDs were blinking brightly. I went ahead and picked up an LCD display to continue following along with Ben. That’s when everything started falling apart.

At first, the LCD seemed promising. It would flicker, the cursor would appear, but that’s where it would stop. After double-checking all of my wiring and making sure the code I was using matched what Ben was using, I still couldn’t get it to work. I decided to continue on with the LCD videos and then ran across the one that mentions the Busy Flag in the LCD controller. I implemented the changes to handle that in my code. I put it in a subroutine, as well as the code that sends instructions and characters to the display. I uploaded the new binary to the EEPROM and stuck it in ROL. Things went from bad to worse. I noticed that as soon as I jumped to a subroutine, the data bus would just start returning 0s. This was interpreted by the processor as a “break” (BRK) instruction, so it read the IRQ vector from the EEPROM and jumped to that. Since I had it set to 0, it just started executing random junk, which would eventually involve another “break” instruction and start the whole process over.
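To make that loop concrete, here is a minimal Python sketch of the one behavior that matters: the 6502 treats opcode 0 as BRK and loads the program counter from the IRQ/BRK vector at $FFFE/$FFFF. It is not code that ran on ROL. The names pc_after_brk, read_byte and zero_bus are made up for illustration, and everything except the BRK case is left out.

```python
# Sketch only: models just the BRK case described above, not a full 6502.
BRK_OPCODE = 0x00
IRQ_VECTOR_LO = 0xFFFE  # the 6502 reads the IRQ/BRK vector from $FFFE/$FFFF


def pc_after_brk(read_byte, pc):
    """Return the new program counter if the byte fetched at pc is BRK.

    read_byte is a hypothetical callback standing in for the data bus.
    Returns None for any other opcode, which this sketch does not model.
    """
    if read_byte(pc) != BRK_OPCODE:
        return None
    lo = read_byte(IRQ_VECTOR_LO)
    hi = read_byte(IRQ_VECTOR_LO + 1)
    return (hi << 8) | lo


def zero_bus(addr):
    """Stand-in bus: every read returns 0 (bad fetches plus a zeroed vector)."""
    return 0x00


print(hex(pc_after_brk(zero_bus, 0xE080)))  # prints 0x0
```

With a 0 fetched as the opcode and a 0 stored in the vector, execution lands at address 0, which is RAM full of whatever happened to be there.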

This seemed incredibly strange. I banged my head on it for a couple of days. I lost hair, weight and any sense of security. I tried several different code experiments and it always ended the same way: the data bus would start returning 0s on a JSR that sent instructions to the LCD controller. After making no progress, I decided to continue adding components to ROL. I needed to feel like I was making progress. I could keep the Jump Bug in the back of my mind while I did something relatively mindless.

I went ahead and added the three remaining 65c22s my design originally called for. I daisy chained them, one to the next, starting with the original one I had been using with the LCD. So, the data and address lines went from the bus along the bottom of the breadboard, up to VIA #0, then from VIA #0 to VIA #1, and so on. I put in a small test program and started noticing unpredictable IRQ behavior. The IRQ handler was getting called at unexpected times. Initially, I thought it might be related to the Jump Bug, which was also causing the IRQ handler to run at unexpected times. But in that case it was because the processor was executing a BRK instruction (opcode 0).

The four 65c22 VIAs daisy chained off the bus running horizontally out of frame. To the far right is the SC28L92 UART.

Having nightmarish visions of another problem that would tie me up for days, I loaded up the 65c22 data sheet in an act of utter desperation. As it turns out, the 65c22 can’t have the IRQ line shared with other 65c22s. I had to break out each IRQ line for each 65c22 and run them into some logic gates that would output a high signal normally, but go low whenever any of the 65c22 IRQ lines went low. I put that into the circuit and fixed the IRQ problem! I hoped that maybe that would fix the weird Jump Bug too. It didn’t.
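The behavior of that gate logic can be sketched in a few lines of Python. This is only a truth-table model under my own assumptions: the function name combined_irq is hypothetical, and the post does not say which gates were used. For active-low lines, "high unless any input is low" is simply a logical AND of the inputs.

```python
def combined_irq(irq_lines):
    """Model the combined IRQ output for active-low inputs.

    irq_lines: iterable of bools, True = line high (idle),
    False = line low (that VIA is requesting an interrupt).
    The output is high only when every input is high.
    """
    return all(irq_lines)


# All four VIAs idle: the output stays high (no interrupt).
print(combined_irq([True, True, True, True]))   # prints True
# VIA #1 pulls its line low: the output goes low.
print(combined_irq([True, False, True, True]))  # prints False
```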

At this point, I had received a few parts in the mail and decided to go ahead and wire those in: an SC28L92 dual UART, some nonvolatile SRAM and some real time clocks. I went ahead and wired up the UART. I couldn’t use it yet because I still needed to get a 3.6864MHz oscillator to use with it. I did wire up all the address and data lines and wired the IRQ line up to the IRQ-combining logic I put in for the VIAs.

The nonvolatile SRAM was even more straightforward. I just had to run some address and data lines to it and tie the chip select to the 74138 decoder I’m using for the address decoding logic. I wrote a test program to write some data to the chip. Then I turned ROL off, powered it back up and ran a program to read from that same memory location. The data I had written was still there! The nonvolatile SRAM chip is a weird beast. It has a built-in lithium battery that keeps the SRAM turned on when no external DC voltage is present. It is fast and, other than the battery backup, it’s just regular SRAM. I’m not sure I’ll keep it around.

A little bit refreshed after getting the wiring for the UART largely in place and getting the nonvolatile SRAM to work, I returned to the Jump Bug. I didn’t feel like doing any in-depth testing with anything else until I could get the system stable. And the Jump Bug was definitely not stable.

With the LCD controller datasheet in hand, I decided to try to get the display working from an Arduino. I could get the code worked out on a stable platform and then transfer that to 6502 assembly. It didn’t take more than a couple of hours to get it to work on the Arduino. Encouraged, I rewrote the 6502 program to do exactly what I had programmed the Arduino to do. Eagerly, I burned the new code onto the EEPROM and placed it back in the system.

Once again, I started getting 0s on the data bus after doing a JSR to set up the LCD display. It was starting to seem like the LCD display was messing with the bus. But how could it? It was connected to the I/O pins of the 65c22, not ROL’s data bus. Maybe it was a timing issue? Maybe the address decoding logic (which has more levels than the ROM and RAM) was causing the timing to the 65c22s to get messed up?

I wrote another small program that did nothing but jumps. It branched forward and back, did absolute jumps forward and back and jumped forward and back to subroutines. It worked perfectly. Maddening. I added some code to read from PORTA on the VIA inside one of the subroutines. It worked perfectly.

I put my LCD code back on the EEPROM and single stepped through it a few times, watching all the address/data and control lines on the Arduino bus monitor that was attached to the system. Finally, I paused on the instruction that returned the first 0 on the data bus. It was address e080. The address that had been accessed prior was e07f, so e080 is what I would expect. What’s special about e080? I removed the access to the VIA from the subroutine and ran it again. It still failed at e080, even though I hadn’t written to or read from the VIA at all! I put in several nops and tried again. This time ROL executed each nop up until the second from the last and then started returning 0s again… starting at e080.

I paused ROL at e080 and turned on the multimeter. I checked the address lines running across the bottom of the breadboards. They were exactly what they should be: 1110 0000 1000 0000. That is what I would expect: the bus monitor was showing address e080, so it only made sense that e080 was exactly what was on the bus.
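For checking lines against a multimeter, it helps to have the expected bit pattern written out, most significant bit (A15) first. A small Python helper for that might look like the following. The name address_bits is just something I made up for this sketch.

```python
def address_bits(addr):
    """Format a 16-bit address as four groups of four bits, A15 first."""
    bits = format(addr & 0xFFFF, "016b")
    return " ".join(bits[i:i + 4] for i in range(0, 16, 4))


print(address_bits(0xE080))  # prints 1110 0000 1000 0000
```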

But there was still a problem. I closely examined the address lines coming up from the bus to the EEPROMs… and then I found what the problem was. Finally! I had swapped address lines 8 and 9 at EEPROM 1. Since EEPROM 0 was daisy chained to it, that meant its address lines 8 and 9 were effectively swapped also. Whenever the code was accessing e080, it was actually accessing e100, which was empty, which means it was filled with… 0s!
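The effect of two swapped address lines is easy to model in Python. One assumption to flag loudly: for e080 to land on e100, the swapped pair has to be the lines carrying bit 7 and bit 8 of the address (A7 and A8 when counting from zero), so the sketch reads "lines 8 and 9" as counted from 1. The function names are hypothetical.

```python
def swap_address_bits(addr, bit_a, bit_b):
    """Return the address a chip sees when two address lines are swapped.

    bit_a and bit_b are zero-based bit positions (0 = A0).
    """
    a = (addr >> bit_a) & 1
    b = (addr >> bit_b) & 1
    if a != b:
        # The two bits differ, so swapping them flips both.
        addr ^= (1 << bit_a) | (1 << bit_b)
    return addr


def first_remapped(start, end, bit_a, bit_b):
    """Return the first address in [start, end) that the swap changes, or None."""
    for addr in range(start, end):
        if swap_address_bits(addr, bit_a, bit_b) != addr:
            return addr
    return None


# Assumption: "lines 8 and 9" counted from 1, i.e. zero-based bits 7 and 8.
print(hex(swap_address_bits(0xE080, 7, 8)))      # prints 0xe100
print(hex(first_remapped(0xE000, 0xF000, 7, 8)))  # prints 0xe080
```

This also shows why everything up to e07f ran fine: below e080, bits 7 and 8 are both 0, so swapping them changes nothing. e080 is the first address where the two bits differ.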

I put the LCD code back on the EEPROM and stuck it in place. No more mysterious 0s coming in on the data bus! The only problem: the LCD still doesn’t work. But at least the system is stable now.

I went ahead and wrote another test program that just writes alternating patterns to both ports of all four VIAs. I’ve confirmed that all of the VIAs are sending the data written to them over their I/O ports. Next, I’ll check that reads work. I really want to get this LCD working. It will help with debugging and, at this point, it’s pretty much a personal Holy War.
