Just adding to this, a few notes:
A field's .text property will always be a string. If you want to stuff non-string data into a field, like a number, list, or dictionary, and pull it back out as the same type, you can use the field's .data property instead. Either way is fine for simple applications.
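As a rough sketch of the difference, assuming a button script and two fields I'm calling "notes" and "total" (both names are made up for illustration), storing a list through .data and reading it back might look like this:

```lil
on click do
 # "notes" and "total" are hypothetical field names
 notes.data:(10,20,30)      # store a list, not a string
 total.text:sum notes.data  # read it back as a list and add it up
end
```

Writing the same list through notes.text would hand you back a string, which you'd have to parse yourself before doing anything list-like with it.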
With an auxiliary field (call it "lastpos") you can record pointer.pos and detect changes between frames (assuming the user didn't manage to precisely start and end on the same pixel within a 60th of a second). As Ahm noted, this approach is a bit brittle on touch-based devices, because they don't generally observe pointer movement except during clicks and drags:
on view do
 if !pointer.pos~lastpos.data
  me.data:0
 else
  me.data:me.data+1
 end
 lastpos.data:pointer.pos
 if me.data > 60*10
  me.data:0
  go[home]
 end
end
I'd recommend using pointer.down or pointer.up to detect clicks instead of pointer.held. If a user makes a very short tap (pressing and releasing within a 60th of a second), a script may not observe pointer.held as truthy. The pointer.down and pointer.up flags will be truthy if the pointer was depressed or released (respectively) at any time during the previous frame, and thus avoid the potential race condition.
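If you also want taps to count as activity (which might help on touch devices), one possible variant of the script above resets the counter whenever pointer.down is truthy. This is only a sketch: it assumes the same "lastpos" field and "home" card as before, and that the counter lives in the field running the script:

```lil
on view do
 # reset on a tap OR on pointer movement since the last frame
 if pointer.down | !pointer.pos~lastpos.data
  me.data:0
 else
  me.data:me.data+1
 end
 lastpos.data:pointer.pos
 # roughly 10 seconds at 60 frames per second
 if me.data > 60*10
  me.data:0
  go[home]
 end
end
```

Since Lil evaluates right to left, the condition reads as pointer.down | (!(pointer.pos~lastpos.data)), so no extra parentheses are needed.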
Does any of that make sense?