Freescale MC9S12NE64 development with Imagecraft ICC12 and NoICE
This web page documents the development of a MC9S12NE64 based ethernet
to RS-485 & DMX & Colorkinetics board ("Leoplayer"), with
Secure Digital file
playback.
Board is 2-layer 3.5" x 2"
Project timeframe: November 2004 - Jan 2005
Client: Leo Villareal
Webpage revisions:
Initial incomplete draft: March 11 2005
Some additions, and probably the last revisions forever: June 24 2005
IMPORTANT UPDATE: Feb 20 2007: subtle, important bug in Freescale code
fixed; see "Important bugfix" under Phase 2 below
Minor update May 19 2007: replaced pictures to show current design
revision, minor text changes, fix links
Top View:
Bottom View:
Top view of earlier version without
LCD
This new (circa Nov. 2004) HCS12 based chip has a lot of potential and
is making a
splash. For $13 qty 25 at digikey one can create a single-chip
ethernet-enabled anything. This project for a longtime client
allows streaming of data to a 16x16, 1' square grid of LEDs (see it here) from
anything that can generate a UDP packet - in particular his custom
Director-based video synthesis system. One of these boards
(called a "leoplayer") drives up to three panels at 625K baud over
RS-485
using a simple protocol. In addition, the leoplayer can play LED
pattern data files off a standard FAT16 SD media card to its panels
and/or broadcast via ethernet to other leoplayer(s). DMX output
was recently added as well, making it simple to control standard DMX
lighting via Ethernet or playback patterns stored on the SD. The
colorkinetics UDP packet format was also added, so this can play files
on the SD to colorkinetics' ethernet-enabled gear.
In an effort to cut initial outlay for development tools, Metrowerks'
$1000 (and up and up) C compiler / dev environment for this chip was not used. For convenient
out-of-the-box, up-and-running development Freescale provides a port of
Viola Systems' "OpenTCP" stack to the Metrowerks environment, so it was
clear very early on that this new chip was 'for real' and running
useful code.
Background
My client wanted an ethernet version of an earlier leoplayer design
that provided file playback from Smartmedia but had no connectivity to
a computer nor any means to synchronize playback of multiple
leoplayers. In addition, Smartmedia is an aging format clearly
heading for the dustbin of history. The older design is based on
an 18F452 PIC with relatively limited performance and a horrendously
buggy C compiler (CCS PCH). Time to move forward.
Essential features:
- Ethernet input to RS-485 output, RS-485 format
- File playback from media cards, with configurable randomization
of parameters and fading in&out
- Small size, since sometimes it has to go inside minimally-sized
art pieces
- Can be powered from 12V to 24V, since much Colorkinetics gear
runs on 24V (read: need a DC/DC to generate 5V and 3.3V)
- One Leoplayer playing from media can "broadcast" parts of each
frame to others over Ethernet to drive large installations from a
single media card
- Various test pattern generation built-in and status indicators
- Flexible playback setup via configuration file on media card
- Responds to "ping", and mini-webserver desirable
After looking into various microcontroller + MAC/PHY or microcontroller
w/MAC + PHY two-chip solutions, I came across the NE64.
Unfortunately it was brand spanking
new. Orders had a 6 weeks lead time, samples not available,
development boards not available. However, based on schematics in
application notes, a well-written datasheet, the availability of the
OpenTCP port and sample application, demo (time but not capability
limited) versions of Imagecraft's
ICC12 compiler & the NoICE debugger, and finally success getting
samples after several phone calls to Freescale, all the pieces came
together to try to do it a much cheaper and more flexible way.
A big motivation for the ICC12 / NoICE route is all the targets they
support. Instead of dropping $1K (or much more) and learning new
tools and porting code every time a design calls for a new chip, for
$300 new targets of Imagecraft's C compiler and NoICE can be purchased
so the same development environment and
existing proven C code can be used.
Phase 1. Hardware design.
First you gotta have hardware. Or at least be waiting for a board
design to come back from the pcb proto place. Since the choice of
development environment didn't impact the hardware, and no evaluation
board existed in 2004, the hardware design definitely came first.
The March 2005 issue of Circuit Cellar would have been a gold mine for
this design, had it arrived in November 2004.
Regardless, it has an excellent write-up of the same quirks and niggles
I ran across. The article is "Single-IC 10/100 Ethernet Solution"
by Fred Eady; go buy it before starting a design with this chip.
In addition, the article on FAT16 for SD/MMC would have been fabulous
to have, but prllc.com's FAT12/FAT16 SD library had already been
purchased and integrated. More on this in phase 4 below.
Follow application note AN2759.
Read it and re-read it.
Page 6 is the schematic, showing the 12.4K 1% for Rbias, which is not
documented anywhere else. Look over the chip data sheet as
well. Note split grounds in the board layout on page 17. I
always follow things like this, as far as I'm concerned if Freescale
(or any other large company) can be bothered to write it up with
diagrams and screen captures it must be important. Besides, 100
megabit ethernet is a lot of fast-moving signals, and taking shortcuts
is inviting disaster in the works-sometimes, can't-pin-down-the-problem
type of nightmares. Network problems created by software bugs and
general configuration mistakes are difficult enough to deal with, it
would be really bad if the hardware itself were flaky as well. I
followed the guidance in this app note, checked my board layout
carefully, and it worked fine.
Major items of note in this design:
0.1uF decoupling caps were added to all those power pins that don't
have .22uF as shown in the schematic.
Status LEDs were done by the ethernet hardware, not software. The
OpenTCP port can be compiled either way.
Ethernet jack with integrated LEDs and magnetics (mouser 673-J1012F01C,
it is Pulse J1012F01C)
SD socket (digikey 478-2018-1-ND)
LCD (mouser 696-LCM-S01602DTRM, it is Lumex LCM-S01602DTR/M.
Digikey carries it now as well.)
Mini-joystick (digikey 401-1130-1-ND and 401-1265-ND)
RS-485 transceiver (maxim MAX3088ECSA, but Maxim had multi-week lead
times on these, so it now uses a generic 75176)
All unused pins went to .1" headers or at least holes for wires to
future one-off add-ons
Watch out for:
- Vias under the SD socket pins, I didn't have a socket in hand
when I did the board layout, and when the SD card is inserted some of
the little 'fingers' that touch the SD card contacts could momentarily
touch vias. A piece of electrical tape on the pcb prevented this
in the prototype.
- Make sure the crystal is near the oscillator pins on the
chip. And you have to use a 25MHz crystal if you want 100Mbit
ethernet - datasheet page 361. This came to bite me later, as I
wound up needing a 20MHz bus clock
to generate 625K baud exactly. This was conveniently solved in
software by using the bus clock PLL to multiply by 4/5. A cool
feature of this chip. Note that several changes must be made if
not running at 25MHz: BUS_CLOCK in ne64config.h, the EtherOpen() call
in ne64driver.c (pass MII_C20 as first argument), the RTI timer needs
tweaking - see RTI_init() in rti.c and datasheet page 178 (I used 0x73
for RTICTL_RTR, which measures 10.4msec on the scope), and any
hand-coded precision delays (I have a delay_us() I wrote in assembler
that does no-op loops for precision).
- This particular LCD is an unreliable PITA in 4-bit mode. A
number of hours were wasted fighting with it and the poor
documentation. If you use it, use it in 8-bit mode and yes you
waste 4 pins on the NE64 to do so. 4-bit was never 100% reliable
so I eventually patched the board to use it in 8-bit mode, which works
perfectly. I believe all the problems revolve around being unable
to reset it after it is put into 4-bit mode (apart from doing a
software controlled power-cycle, requiring external PFETs since the LCD
is 5V in a 3.3V world) so if the board power comes up badly and the LCD
needs a reset, or the NE64 resets (during debugging one does this all
the time) sending the LCD another go-into-4-bit-mode as per normal
startup essentially crashes it. Crap. No documentation
about any of this from Lumex, although it is clear it
is a real problem. Shame on you Lumex for errors and omissions in
the
datasheet - that wasn't even available on your website. Your lack
of QC cost me time and my client money.
- The 6-pin BDM (programmer/debugger) connection is a
semi-standard; be sure to get the pins the right way around so you can
hook up a programmer to it without hassle. More below.
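The clock numbers in the crystal item above can be checked on a PC with a few lines of C. This is only a sketch under stated assumptions, not code from the project. It assumes the usual HCS12 relations as I understand them: bus clock = OSCCLK x (SYNR+1)/(REFDV+1) with the PLL engaged, SCI baud = bus clock / (16 x divisor), and RTI period = (RTR[3:0]+1) x 2^(RTR[6:4]+9) / OSCCLK. The function names are made up for illustration, and SYNR=3 / REFDV=4 is just one register combination that yields 4/5. Check everything against the datasheet before relying on it.

```c
#include <stdint.h>

/* Bus clock with the PLL engaged. ASSUMPTION: the usual HCS12 CRG relation
   PLLCLK = 2 * OSCCLK * (SYNR+1)/(REFDV+1), with bus clock = PLLCLK / 2. */
uint32_t bus_clock_hz(uint32_t osc_hz, uint8_t synr, uint8_t refdv)
{
    return (uint32_t)(((uint64_t)osc_hz * (synr + 1u)) / (refdv + 1u));
}

/* SCI divisor for an exact baud rate. ASSUMPTION: baud = bus / (16 * divisor).
   Returns 0 when no integer divisor hits the requested rate exactly. */
uint16_t sci_exact_divisor(uint32_t bus_hz, uint32_t baud)
{
    uint32_t per_bit;

    if (baud == 0)
        return 0;
    per_bit = 16u * baud;
    if (bus_hz % per_bit != 0)
        return 0;
    return (uint16_t)(bus_hz / per_bit);
}

/* RTI period in microseconds (rounded down) for a given RTICTL value.
   ASSUMPTION: period = (RTR[3:0]+1) * 2^(RTR[6:4]+9) / OSCCLK, and
   RTR[6:4] == 0 means the RTI is off (reported here as 0). */
uint32_t rti_period_us(uint32_t osc_hz, uint8_t rtictl)
{
    uint32_t prescale = (rtictl >> 4) & 0x07u;
    uint32_t modulus = (rtictl & 0x0Fu) + 1u;
    uint64_t osc_cycles;

    if (prescale == 0 || osc_hz == 0)
        return 0;
    osc_cycles = (uint64_t)modulus << (prescale + 9u);
    return (uint32_t)((osc_cycles * 1000000u) / osc_hz);
}
```

With a 25MHz crystal, bus_clock_hz(25000000, 3, 4) gives a 20MHz bus, for which the 625K baud divisor is exactly 2; at a 25MHz bus there is no exact divisor. Under the same assumptions, 0x73 works out to about 10.49msec per tick, in line with the scope measurement, or roughly 95 ticks per second.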
Phase 2. Software for Ethernet connectivity
Somewhat concurrently with the board design, and full-time while
waiting for the pcbs, the software was developed.
OpenTCP and Freescale's port of it and demo application were done
fairly well. I found the structure straightforward to follow and
comments mostly correct and complete. When evaluating it while
deciding on which chip/platform to use for this project it seemed
pretty decent, and fortunately that was the case when it came time to
actually use it.
The first step - porting it all to Imagecraft's ICC12 compiler. A
time-but-not-codesize limited demo was available, so if the port turned
into a disaster, I could cut my losses and drop the bucks for
Metrowerks. I was surprised and pleased with how easy it
was. I like this compiler! I just wish the (simple) IDE
supported the mouse wheel, of all minor things that can become major
annoyances...
Please note: this port was done
with the Freescale OpenTCP port released September 1 2004, not version 1.0 released Feb 22
2005. As such some of the items below may no longer be relevant
and other things may need to be done.
While the ported code can't be released (for free anyway, see bottom of
page), here is what you need to know to do the port and configure the
project for ICC12. I'm about to save you days
of work and considerable hair-pulling so listen up.
- MC9S12NE64.h is a monster (10K lines) header file with every bit
of every register defined 9 different ways. It is pretty useful
actually. Unfortunately Metrowerks has a non-standard extension
whereby bitfields can be an unsigned char size. ICC12 requires
them to be unsigned int. Many hundreds of lines have to
change. So, a little Windows console program was written to do
this smartly. Additionally, the IEVENT and IMASK structures are
16 bits long, and the ordering of the bits differ between Metrowerks
and ICC12. To save re-inventing the wheel, here is the modified
MC9S12NE64.h, called MC9S12NE64_icc12.h
The file is copyrighted, but since it is publicly available from
Freescale I think it's OK to provide this modified one here.
- How ISRs are declared. Er, I don't have an example of the
Metrowerks way anymore, but with ICC12 they are done like this. I
don't know if the nonpaged_function pragma is required or if that is
implied by interrupt_handler, but to be safe I explicitly declare all
ISRs as nonpaged. More about paging below.
#pragma nonpaged_function RealTimeInterrupt
#pragma interrupt_handler RealTimeInterrupt
- Unsupported or irrelevant Metrowerks #pragmas... Er, it's been too
long, don't remember. Whatever they were, when ICC12 complains,
fix or remove them. Wasn't hard.
- Moving where RAM is and the stack pointer. A few things
need to happen:
- The NE64 puts the three ethernet buffers at 0x2000, and the
arrays used for this space need to be defined like this: (in
ne64driver.c):
#define RAM_START 0x2000
#pragma abs_address: RAM_START
tREG16 emacFIFOa[EMAC_RX_SZ/2] = {0};   /**< Emac RX buffer A definition */
tREG16 emacFIFOb[EMAC_RX_SZ/2] = {0};   /**< Emac RX buffer B definition */
tREG16 emacFIFOtx[EMAC_TX_SZ/2] = {0};  /**< Emac TX buffer definition */
#pragma end_abs_address
- You have to load INITRM (register $10) with $20 prior to doing
anything else. Point the reset vector to the below code, which
does this and then calls __start(), the ICC12 standard C runtime
startup routine:
void _jsl_start(void)
{
    asm("movb #$20, $10");   // INITRM= 0x20
    asm("lbra __start");
}
- Paged functions. Unfortunately, with ICC12 you can either
mark all functions as paged or not paged by default. This means
if the paged or non-paged memory gets full, you have to go explicitly
declare functions to be the other to fit in memory. For chips
with lots of paged mem and not much unpaged, this isn't a big
deal. But for the NE64 it is a drag since to get the full 64K you
need 32K paged, and 32K non-paged. What I did was have unpaged
functions be default, and put all ISRs, functions called by function
pointer, my _jsl_start() two-liner and main() in unpaged, and all the
TCP/IP stuff and my more "finished" functions in paged memory. I
just kept adding functions to paged memory until it was full, but of
course if you add significant code to a function it (or actually
whatever was the last one put in paged memory) might not fit anymore -
hence doing it to "finished" functions. And by looking at the map
file you can see what ICC12 put where to check your work and see how
much space is left. So for example at the top of ne64driver.h I
have:
#pragma paged_function EtherInit MIIwrite MIIread EtherType EtherIoctl EtherGetPhysAddr EtherSend
#pragma paged_function ProcessPacket EtherOpen EtherClose EmacDisable EmacEnable EmacControl EtherAbortTx
#pragma paged_function EtherPause EtherOtherTx
// next one needs to be #pragma'd in ne64api.c as well (line 94)
#pragma paged_function EtherStartFrameTransmission
One gotcha with using #pragma paged_function in a .h file is that if a
.c file that calls one of these functions doesn't #include the .h, the
linker can't find the function, and you pull your hair out wondering
why EtherInit() can't be found when it is clearly being compiled.
- Vector table. As I recall ... it was awhile ago ... I took
the vector table from the Freescale code and pretty much pasted it into
the one supplied as a sample by imagecraft. The last entry (reset
vector) points to _jsl_start, given above, not __start. The
datasheet page 88 has them listed as to which one is for what.
- Project settings. Oh so so important! Here they are,
for paged memory. Listen up, this took awhile to get
working. Go to Projects -> Options:
Compiler tab: Output format is S19 with ASM/Source level debug
Target tab:
Custom device
Program memory: 0x4000.0x7FFF:0xC000.0xFEFF
Data memory: 0x2C00 (note this depends on size of ethernet buffers,
chosen in ne64config.h as BUFMAP. 0x2C00 is for 1K buffers)
Stack pointer: 0x4000
In Expanded Memory:
Enable is Checked, and Addr is 0xF0000.0xF7FFF
Paged functions default is UNchecked (for my usage described above)
S2 Record type: Linear is selected, Map vector page checked
Advanced: in Other Options have "-bidata:0xFE00" (to put some sort of
initializing data there, I forget why I had to do this)
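The Data memory and Stack pointer values above are plain address arithmetic, and they change if the ethernet buffer sizes change. A minimal sketch of that arithmetic, assuming RAM occupies 0x2000 through 0x3FFF (which is what INITRM = $20 and the 0x4000 stack pointer above imply) with the two RX buffers and one TX buffer placed first. The function names are hypothetical, not from the Freescale code or ICC12:

```c
#include <stdint.h>

#define RAM_START 0x2000u   /* RAM base after INITRM = 0x20 */
#define RAM_END   0x4000u   /* one past the last RAM byte; initial stack pointer */

/* First address free for C data: just past RX buffer A, RX buffer B
   and the TX buffer, which sit at the start of RAM. */
uint16_t data_memory_start(uint16_t rx_buf_bytes, uint16_t tx_buf_bytes)
{
    return (uint16_t)(RAM_START + 2u * rx_buf_bytes + tx_buf_bytes);
}

/* Bytes left between the data start and the stack top, to be shared
   by variables and the stack. */
uint16_t ram_left_for_data_and_stack(uint16_t rx_buf_bytes, uint16_t tx_buf_bytes)
{
    return (uint16_t)(RAM_END - data_memory_start(rx_buf_bytes, tx_buf_bytes));
}
```

With 1K buffers this reproduces the 0x2C00 used for Data memory and shows that 0x1400 (5120) bytes remain for all variables plus the stack, which is why the stack overflow check described further down seemed worth the trouble.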
Things to watch out for:
- The RTI that produces a (nominal) 10msec tick, that OpenTCP uses
for all timing. To start with, the TIMERTIC setting is a little
off, it should be 95, not 100. The timers get updated in an ISR
(RTI.c RealTimeInterrupt() calling decrement_timers() in timers.c), and
it can be a little long - 130usec. This caused problems in this
design due to the way the LED panels connected to the leoplayer
work. If your design can't ignore interrupts for this long,
something has to change. I rewrote decrement_timers() to be a
little faster and had it check for my condition that had to be handled
quickly after each timer was decremented. Alternatively, you
could disable the timer during critical code - see RTI_Enable() and
RTI_Disable() in RTI.c. OpenTCP doesn't use them for anything
critical, if one is late or even if a bunch are missing no problem.
- Unique MAC addresses. They are supposed to be globally
unique (as in, the whole world), so the IEEE has a procedure set up for
handling this. For small fry companies who can do with only 4096
different ones, go here:
http://standards.ieee.org/regauth/oui/pilot-ind.html and bring $550
with you. In reality, they only have to be unique to the subnet
the board is on. How you solve this is your decision, but be sure
you don't have multiple boards with the same MAC address. This
causes no end of weird network flakiness. It can be easy to
accidentally have this happen in a lab environment, where there are a
couple of prototypes built with various stages of code on them.
- 10 or 100Mbit default? auto-negotiate on/off? This is
set in ne64config.h. I had 10 & 100Mbit auto-negotiate
working, then later discovered that at 10Mbit it would "link" but that
was it. Not sure what happened but for my uses this is a bug I'll
probably never investigate. Note that there is an errata
involving auto-negotiation, see the link to the Freescale errata
.pdf at the bottom of the page.
- stack overflow. It appears the NE64 has no hardware means
to protect from this (maybe no microcontroller does...) and since the
RAM in this project was relatively scarce (due to buffers needed for
streaming out serial ports and reading from the SD) I was concerned
about overwriting RAM with the stack. So I did a little trick in
main() where I initialize stuff:
#ifdef DEBUG_STACK
g_stackcheck= (uInt16*)((&_bss_end) + 16);
*g_stackcheck= 0xDEAD;
// wipe stack area to something so we can see in debugger how much has
// been used.
{
    uInt8 *currsp;
    uInt8 *ptr;
    uInt8 val;
    asm("PSHY");          // push Y onto stack
    asm("TSY");           // stack pointer -> Y
    asm("STY %currsp");   // Y -> currsp
    asm("PULY");          // pop Y offa stack
    val= 0;
    ptr= (uInt8*)(&g_stackcheck[1]);
    for (; ptr < currsp; )
    {
        *ptr++= val;
        val++;
    }
    last_margin= 0xFFFF;
}
#endif
So what is going on here is I use the
_bss_end compiler-generated variable to find the end of the statically
allocated variables in RAM, mark a spot just past it
with 0xDEAD, and then with a bit of assembler get the stack pointer,
and fill from one word past 0xDEAD to the stack pointer with 00, 01,
... etc. Periodically in my main() loop I scan this area and make
sure 0xDEAD is still there and count how many 00, 01, ... there
are. Thus, as the system runs I get a debugging printf whenever
the stack has reached a "new low". Turns out it was all fine, I
had a considerable margin left.
- Important
bugfix: The function NE64Receive() in ne64api.c has a problem.
This function is called when the ethernet hardware interrupts with a
new data packet ready. The function disables the interrupt for
that buffer (A or B), then copies it to user RAM (the call to
mENQUEUE()). The problem is that mENQUEUE() can fail, under
conditions of high packet reception and relatively slow processing of
them in the main loop. In that case the disabled interrupt is
never enabled again. After this happens for the first time
packets start getting dropped left and right; after the second time it
is dead. I'm unclear on the exact mechanics to create this
situation, but in a
project Feb 2007 it took two people two days to fully diagnose this
failure, which had never been seen before. For a long time we all
thought it was power supply problems. An installation of 4
leoplayers would run fine for an hour then one would fail, but only
when being driven from some custom software for the installation - not
my test software. The revised code is as follows:
UINT16 NE64Receive (void *PktBuffer, UINT16 len, UINT16 flags)
{
    MBUF mp;

    mp.data = (UINT8 *)PktBuffer;
    mp.working_ptr = mp.data;
    mp.len = len;
    mp.status = (MBUF_NOTEMPTY | flags);
    if (mENQUEUE (&mp) == 0)    // if queue successful, ...
    {                           // then block interrupts.
        if (mp.status & IEVENT_RXACIF_MASK)
        {
            IMASK_RXACIE = 0;   // Block IRQs on buffer A
        }
        else
        {
            IMASK_RXBCIE = 0;   // Block IRQs on buffer B
        }
    }
    return 0;
}
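The periodic scan from the stack overflow item above (check that 0xDEAD is intact, count the surviving 00, 01, ... bytes) is not shown on this page. A minimal sketch of the idea, written so it can be tried on a PC: the function names and the plain byte array standing in for the stack area are assumptions for illustration. On the target, the region would run from just past the 0xDEAD word up to the stack pointer captured at startup.

```c
#include <stddef.h>
#include <stdint.h>

#define STACK_GUARD 0xDEADu

/* Fill the unused stack region with 00, 01, 02, ... (wrapping after 0xFF),
   the same pattern the startup code above writes. */
void stack_fill_pattern(uint8_t *region, size_t len)
{
    size_t i;

    for (i = 0; i < len; i++)
        region[i] = (uint8_t)i;
}

/* Count how many pattern bytes survive, scanning up from the low end.
   The stack grows down from the high end, so this is the remaining margin. */
size_t stack_margin(const uint8_t *region, size_t len)
{
    size_t i = 0;

    while (i < len && region[i] == (uint8_t)i)
        i++;
    return i;
}

/* Returns 1 when the guard word is damaged or the margin hits a new low
   (updating *lowest); the caller would print a debug message in that case. */
int stack_check(uint16_t guard, const uint8_t *region, size_t len, size_t *lowest)
{
    size_t margin = stack_margin(region, len);

    if (guard != STACK_GUARD)
        return 1;
    if (margin < *lowest) {
        *lowest = margin;
        return 1;
    }
    return 0;
}
```

Starting *lowest at 0xFFFF, as last_margin is above, means the first call always reports. One known weakness of any pattern scheme like this: a stacked byte that happens to equal the pattern at that address makes the margin look slightly larger than it is.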
Phase 3. NoICE and pulling it all together
NoICE works well; the only serious thing missing is a
call stack display. But the stack is compiler dependent, and so code
would have to be added for each compiler out there, so I understand why
it isn't done. But it remains a significant omission.
In the Options -> Target Communications, set the following:
Interface & port: whatever you are using (see below for my setup)
Target Chip/Environment: MC9S12NE64 Flash (if you don't
have this, contact the author for an updated devices file).
Use PPAGE at 0030: checked
Use 24-bit Hardware...: checked
Use Flash/EEPROM Burner: checked
Bus frequency: 25Mhz
Loading the file to burn:
File -> Load
Choose (projectname).dbg, set Files of Type: to Imagecraft DBG files
Have S2 records have linear addresses be Checked
For the programmer, I bought pemicro.com's BDM Multilink, this is the
parallel port version, for $200. It works pretty well, but I
sometimes have problems doing resets (for unknown reasons they can take
more than 30 seconds) and reading the chip
correctly, even with the parallel port patch for XP (see their
website). I recently got the USB version for $99, and to my
horror discovered that it is very
slow to program the chip - 64K takes (no joke) several minutes, vs. 30 seconds or
so for the parallel port version. This makes it nearly useless
for normal development where code is being programmed and tested
rapidly. All I can think is that it has been crippled so you have
to buy the $200 or $500 models, although it could be something up with
NoICE. There is no mention of speed (or lack thereof) in the
blurb on pemicro's website for this unit. Additionally, pemicro's
devices come with no software - just to burn a chip with a hex file you
have to buy more. (NoICE does the programming so it can be
avoided). So for now I'm taking a rather dim view of pemicro...
Phase 4. Software for SD
The library from prllc.com was pretty good but not perfect; I had to
tweak some things and pretty much rewrote the file reading to get
something like a 40X speedup. However it was quite readable and
worked fine and is reasonably compact. It doesn't support long
file names, however, which is a real drag, nor FAT32. (This was
the version in 2004, it has undoubtedly been improved).
Note that in the very same issue of Circuit Cellar where the NE64 is
discussed, a FAT library for SD is discussed. This would be an
excellent (and free) alternative.
Phase 5. Test / Utility software for Windows
An important part of developing anything computer controlled
(particularly networked) is to have a simple test application to help
troubleshoot. Here is a screenshot
of that app, which generates an all-on 10%, all-on 100%, then chase
over a specified range for panels, dmx, colorkinetics, and the Hive project. Written in VC++, it
is very handy to have to quickly tweak to create custom test patterns
to exercise hardware during development and to check "corner cases"
that might not come up often in the field.
Another essential application is a utility (screenshot)
to transform sequence files from ascii text to binary for efficient SD
storage and fast playback. The source DMX data files are in some
kind of standardized human-readable format, and the source panel files
are a simple ascii format invented for this project. This app
reads them line by line, verifies that there are the correct number of
channels in each frame, and outputs the binary file. Shows
progress indication, can pick a directory to process all files in a
batch, etc.
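The per-line step of such a converter can be sketched as follows. The actual file formats are not described on this page, so this assumes a made-up one: each text line is one frame of whitespace-separated decimal channel values from 0 to 255. The function name is hypothetical and this is not the utility's code, just an illustration of the parse-verify-emit idea.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Convert one text line of decimal channel values into binary bytes.
   ASSUMED format: whitespace-separated values, 0..255, one frame per line.
   Returns 0 on success, -1 if a value is malformed or out of range, or if
   the line does not hold exactly `channels` values. */
int parse_frame_line(const char *line, uint8_t *out, size_t channels)
{
    size_t count = 0;
    const char *p = line;

    for (;;) {
        char *end;
        long v = strtol(p, &end, 10);

        if (end == p)                   /* no further number on this line */
            break;
        if (v < 0 || v > 255 || count >= channels)
            return -1;
        out[count++] = (uint8_t)v;
        p = end;
    }

    /* Anything left over other than whitespace is garbage. */
    while (*p == ' ' || *p == '\t' || *p == '\r' || *p == '\n')
        p++;
    if (*p != '\0')
        return -1;

    return (count == channels) ? 0 : -1;
}
```

A caller would read the source file with fgets(), pass each line through this function, and fwrite() the resulting bytes, stopping with the line number when it returns -1.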
Remaining Problems
Activity LED very dim. It looks like the NE64 can't sink much
current with this pin. I don't get it.
Update: found errata on Freescale's website (see link below),
turns out this is a known problem.
Non-working with Netgear auto-crossover ethernet switches. This
happens with some D-link as well; it seems to be related to whatever
chip is running the show inside them. Curiously Linksys (at least
the SD205) with auto-crossover works fine. I really don't get
it. If anyone has knowledge of what is going on (even without a
workaround) pls. let me know. For now I just have my client buy
SD205 boxes.
Schematics? Board layouts? Gerbers? BOM? Software downloads? ...
Fortunately this project was a paying gig for a client.
Unfortunately this means the IP is not available for free
give-away. This design (or probably more usefully a customized
variation of it) could be licensed to interested parties depending on
all the details to be determined. If your budget can accommodate a
low-thousands range, feel free to contact me. Schematics and
board layout are in Eagle.
Links (working as of 5/18/07)
Freescale MC9S12NE64 home page
App note AN2759 (.pdf)
Errata for MC9S12NE64 (read this!)
Freescale's port of OpenTCP
A forum for Freescale chips, including the NE64, that unfortunately started after this design was done
Imagecraft's compilers
NoICE debugger / programmer
BDM module: go to pemicro.com and search for "BDM multilink" in products, there is no way to link to the product page directly
Progressive Resources' FAT12/16 library for SD (company name has changed)
OpenTCP: Sourceforge project
Circuit Cellar March 2005